Zep’s syntax highlighting system is designed to be extensible, supporting multiple languages through a pluggable architecture. Syntax highlighting runs asynchronously to avoid blocking the editor during updates.
ZepSyntax Base Class
From include/zep/syntax.h:51-77:
class ZepSyntax : public ZepComponent
{
public:
    ZepSyntax(ZepBuffer& buffer,
        const std::unordered_set<std::string>& keywords = {},
        const std::unordered_set<std::string>& identifiers = {},
        uint32_t flags = 0);
    virtual ~ZepSyntax();

    virtual SyntaxResult GetSyntaxAt(const GlyphIterator& index) const;
    virtual void UpdateSyntax();
    virtual void Interrupt();
    virtual void Wait() const;
    virtual long GetProcessedChar() const
    {
        return m_processedChar;
    }

    virtual void Notify(std::shared_ptr<ZepMessage> payload) override;
    const NVec4f& ToBackgroundColor(const SyntaxResult& res) const;
    const NVec4f& ToForegroundColor(const SyntaxResult& res) const;
    // ... remaining members omitted
};
Syntax highlighting operates on the buffer asynchronously. The m_processedChar atomic tracks how much of the buffer has been analyzed.
Syntax Result
Syntax information is returned as a SyntaxResult:
From include/zep/syntax.h:37-48:
struct SyntaxData
{
    ThemeColor foreground = ThemeColor::Normal;
    ThemeColor background = ThemeColor::None;
    bool underline = false;
};

struct SyntaxResult : SyntaxData
{
    NVec4f customBackgroundColor;
    NVec4f customForegroundColor;
};
Theme Colors
Syntax uses theme colors that adapt to the active theme:
enum class ThemeColor
{
    Normal,      // Default text
    Keyword,     // Language keywords (if, while, class)
    Identifier,  // Known identifiers
    Number,      // Numeric literals
    String,      // String literals
    Comment,     // Comments
    Whitespace,  // Spaces and tabs
    Parenthesis, // Brackets and braces
    Background,  // Editor background
    // ... many more
};
Using theme colors instead of hardcoded RGB values ensures your syntax highlighting adapts when users switch between light and dark themes.
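The idea of symbolic colors can be sketched outside Zep as follows. This is a minimal illustration, not Zep's theme code: Color, Theme, Rgba, PaletteFor and Resolve are hypothetical stand-ins, and the RGB values are arbitrary.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; Zep's real ThemeColor and NVec4f are richer.
enum class Color : std::size_t { Normal, Keyword, Comment, Count };
enum class Theme { Dark, Light };

struct Rgba
{
    std::uint8_t r, g, b, a;
};

using Palette = std::array<Rgba, static_cast<std::size_t>(Color::Count)>;

// One palette per theme; the syntax layer only ever stores the Color index.
inline const Palette& PaletteFor(Theme theme)
{
    static const Palette dark = {{ {220, 220, 220, 255}, {86, 156, 214, 255}, {106, 153, 85, 255} }};
    static const Palette light = {{ {30, 30, 30, 255}, {0, 0, 200, 255}, {0, 128, 0, 255} }};
    return theme == Theme::Dark ? dark : light;
}

// Resolve a symbolic color to concrete RGBA at draw time.
inline Rgba Resolve(Theme theme, Color color)
{
    return PaletteFor(theme)[static_cast<std::size_t>(color)];
}
```

Because the highlighter stores only the symbolic value, switching themes changes what Resolve returns without re-running any highlighting.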
How Syntax Highlighting Works
Initialization
From src/syntax.cpp:15-29:
ZepSyntax::ZepSyntax(
    ZepBuffer& buffer,
    const std::unordered_set<std::string>& keywords,
    const std::unordered_set<std::string>& identifiers,
    uint32_t flags)
    : ZepComponent(buffer.GetEditor())
    , m_buffer(buffer)
    , m_keywords(keywords)
    , m_identifiers(identifiers)
    , m_stop(false)
    , m_flags(flags)
{
    m_syntax.resize(m_buffer.GetWorkingBuffer().size());
    m_adornments.push_back(std::make_shared<ZepSyntaxAdorn_RainbowBrackets>(*this, m_buffer));
}
Each character in the buffer has a corresponding SyntaxData entry.
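A per-character table like this can be sketched with a plain vector. The names below (Color, Cell, MarkRange, HighlightDemo) are hypothetical stand-ins for illustration and are not part of Zep.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

enum class Color { Normal, Keyword, Comment };  // stand-in for ThemeColor

struct Cell  // stand-in for SyntaxData
{
    Color foreground = Color::Normal;
};

// Color the half-open character range [first, last), clamped to the table.
inline void MarkRange(std::vector<Cell>& cells, std::size_t first, std::size_t last, Color color)
{
    last = std::min(last, cells.size());
    for (std::size_t i = first; i < last; ++i)
    {
        cells[i].foreground = color;
    }
}

// Build a table with one cell per character, then mark one token.
inline std::vector<Cell> HighlightDemo(const std::string& text)
{
    std::vector<Cell> cells(text.size());
    const std::size_t pos = text.find("while");
    if (pos != std::string::npos)
    {
        MarkRange(cells, pos, pos + 5, Color::Keyword);
    }
    return cells;
}
```

Looking up the color of character i is then a single index into the vector, which is what makes per-character lookups cheap at draw time.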
Update Process
From src/syntax.cpp:145-333:
void ZepSyntax::UpdateSyntax()
{
    auto& buffer = m_buffer.GetWorkingBuffer();
    auto itrCurrent = buffer.begin() + m_processedChar;
    auto itrEnd = buffer.begin() + m_targetChar;

    // delim holds the delimiter characters; its definition is elided from this excerpt

    // Walk backwards to previous delimiter
    while (itrCurrent > buffer.begin())
    {
        if (std::find(delim.begin(), delim.end(), *itrCurrent) == delim.end())
        {
            itrCurrent--;
        }
        else
        {
            break;
        }
    }

    // Parse tokens and apply colors
    while (itrCurrent != itrEnd)
    {
        // Bail out if Interrupt() has requested a stop
        if (m_stop == true)
            return;

        // Find next token
        auto itrFirst = buffer.find_first_not_of(itrCurrent, buffer.end(),
            delim.begin(), delim.end());
        auto itrLast = buffer.find_first_of(itrFirst, buffer.end(),
            delim.begin(), delim.end());
        auto token = std::string(itrFirst, itrLast);

        // Check against keyword/identifier sets
        if (m_keywords.find(token) != m_keywords.end())
        {
            mark(itrFirst, itrLast, ThemeColor::Keyword, ThemeColor::None);
        }
        else if (m_identifiers.find(token) != m_identifiers.end())
        {
            mark(itrFirst, itrLast, ThemeColor::Identifier, ThemeColor::None);
        }
        // ... more token classification (elided, including the step that advances itrCurrent)
    }
}
Token Classification Logic
The base syntax highlighter classifies tokens:
Keywords - Matches against the keyword set
Identifiers - Matches against the identifier set
Numbers - Tokens that contain only the digits 0-9
Strings - Text enclosed in quotes, with escape handling
Comments - Text starting with // or, for Lisp-like languages, ;
Parentheses - Brackets and braces
Whitespace - Spaces and tabs
Normal - Everything else
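The first few rules in the list above can be sketched as a standalone function. This is an illustration under assumptions, not Zep's implementation: TokenKind and Classify are hypothetical names, and the order of checks (keywords, then identifiers, then numbers) simply follows the if/else chain shown in the UpdateSyntax excerpt.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>

enum class TokenKind { Keyword, Identifier, Number, Normal };

// Classify one already-split token; strings and comments are not handled here.
inline TokenKind Classify(const std::string& token,
    const std::unordered_set<std::string>& keywords,
    const std::unordered_set<std::string>& identifiers)
{
    if (keywords.count(token) != 0)
    {
        return TokenKind::Keyword;
    }
    if (identifiers.count(token) != 0)
    {
        return TokenKind::Identifier;
    }
    // A number is a non-empty token made up only of the digits 0-9.
    const bool allDigits = !token.empty()
        && std::all_of(token.begin(), token.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
    return allDigits ? TokenKind::Number : TokenKind::Normal;
}
```

Under this digits-only rule, a token such as 4x or 3.14 falls through to Normal.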
Registering Syntax Providers
Syntax providers are registered by file extension:
From include/zep/editor.h:210-216:
using tSyntaxFactory = std::function<std::shared_ptr<ZepSyntax>(ZepBuffer*)>;

struct SyntaxProvider
{
    std::string syntaxID;
    tSyntaxFactory factory = nullptr;
};
From include/zep/editor.h:321:
void RegisterSyntaxFactory(const std::vector<std::string>& mappings,
    SyntaxProvider factory);
Example Registration
// Register C++ syntax
editor.RegisterSyntaxFactory(
    { ".cpp", ".h", ".hpp", ".cc", ".cxx" },
    SyntaxProvider{
        "cpp",
        [](ZepBuffer* buffer) {
            return std::make_shared<ZepSyntax_CPP>(*buffer);
        }
    }
);

// Register Python syntax
editor.RegisterSyntaxFactory(
    { ".py" },
    SyntaxProvider{
        "python",
        [](ZepBuffer* buffer) {
            return std::make_shared<ZepSyntax_Python>(*buffer);
        }
    }
);
When a buffer is loaded, Zep checks the file extension and creates the appropriate syntax highlighter automatically.
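The extension lookup described above can be sketched as a small registry. This is a simplified model, not Zep's editor code: Highlighter, Factory and Registry are hypothetical, and the exact matching rules Zep applies to file names are not shown in this document.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

struct Highlighter  // stand-in for ZepSyntax
{
    std::string id;
};

using Factory = std::function<std::shared_ptr<Highlighter>()>;

class Registry
{
public:
    // Map every extension in the list to the same factory.
    void Register(const std::vector<std::string>& extensions, Factory factory)
    {
        for (const auto& ext : extensions)
        {
            m_factories[ext] = factory;
        }
    }

    // Returns nullptr when the path has no extension or no factory matches.
    std::shared_ptr<Highlighter> CreateFor(const std::string& path) const
    {
        const auto dot = path.find_last_of('.');
        if (dot == std::string::npos)
        {
            return nullptr;
        }
        const auto found = m_factories.find(path.substr(dot));
        if (found == m_factories.end())
        {
            return nullptr;
        }
        return found->second();
    }

private:
    std::map<std::string, Factory> m_factories;
};
```

Storing a factory rather than an instance lets each buffer get its own highlighter object, which matters because the highlighter holds per-buffer state.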
Syntax Flags
From include/zep/syntax.h:27-35:
namespace ZepSyntaxFlags
{
enum
{
    CaseInsensitive = (1 << 0),     // Keywords are case-insensitive
    IgnoreLineHighlight = (1 << 1), // Don't highlight current line
    LispLike = (1 << 2)             // Use Lisp-style syntax rules
};
};
Flags affect parsing behavior:
// For a case-insensitive language like SQL
auto sqlSyntax = std::make_shared<ZepSyntax>(
    buffer,
    keywords,
    identifiers,
    ZepSyntaxFlags::CaseInsensitive
);

// For Lisp-family languages
// (distinct variable names so both examples can share one scope)
auto lispSyntax = std::make_shared<ZepSyntax>(
    buffer,
    keywords,
    identifiers,
    ZepSyntaxFlags::LispLike
);
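How a flag such as CaseInsensitive might change keyword matching can be sketched as follows. This is one plausible approach, not a description of Zep's internals: DemoFlags, ToLower and IsKeyword are hypothetical, and the sketch assumes the keyword set is stored in lower case.

```cpp
#include <algorithm>
#include <cctype>
#include <cstdint>
#include <string>
#include <unordered_set>

namespace DemoFlags  // mirrors the shape of ZepSyntaxFlags; illustrative only
{
enum : std::uint32_t { CaseInsensitive = (1u << 0) };
}

inline std::string ToLower(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Assumes the keyword set is stored in lower case when the flag is set.
inline bool IsKeyword(const std::string& token,
    const std::unordered_set<std::string>& keywords, std::uint32_t flags)
{
    const bool fold = (flags & DemoFlags::CaseInsensitive) != 0;
    return keywords.count(fold ? ToLower(token) : token) != 0;
}
```

Flags are bit values, so several can be combined with | and tested individually with &.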
Asynchronous Updates
Buffer Change Notifications
From src/syntax.cpp:110-142:
void ZepSyntax::Notify(std::shared_ptr<ZepMessage> spMsg)
{
    if (spMsg->messageId == Msg::Buffer)
    {
        auto spBufferMsg = std::static_pointer_cast<BufferMessage>(spMsg);
        if (spBufferMsg->pBuffer != &m_buffer)
            return;

        if (spBufferMsg->type == BufferMessageType::PreBufferChange)
        {
            Interrupt(); // Stop current highlighting
        }
        else if (spBufferMsg->type == BufferMessageType::TextDeleted)
        {
            Interrupt();
            m_syntax.erase(m_syntax.begin() + spBufferMsg->startLocation.Index(),
                m_syntax.begin() + spBufferMsg->endLocation.Index());
            QueueUpdateSyntax(spBufferMsg->startLocation, spBufferMsg->endLocation);
        }
        else if (spBufferMsg->type == BufferMessageType::TextAdded)
        {
            Interrupt();
            m_syntax.insert(m_syntax.begin() + spBufferMsg->startLocation.Index(),
                ByteDistance(spBufferMsg->startLocation,
                    spBufferMsg->endLocation),
                SyntaxData{});
            QueueUpdateSyntax(spBufferMsg->startLocation, spBufferMsg->endLocation);
        }
    }
}
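The bookkeeping in Notify keeps the per-character table the same length as the text. A minimal sketch of that idea, using a hypothetical Cell type and free functions rather than Zep's classes:

```cpp
#include <cstddef>
#include <vector>

struct Cell  // stand-in for SyntaxData
{
    int color = 0;  // 0 means "not highlighted yet"
};

// Text was inserted at [start, end): open a gap of default cells.
inline void OnTextAdded(std::vector<Cell>& cells, std::size_t start, std::size_t end)
{
    cells.insert(cells.begin() + static_cast<std::ptrdiff_t>(start), end - start, Cell{});
}

// Text at [start, end) was removed: drop the matching cells.
inline void OnTextDeleted(std::vector<Cell>& cells, std::size_t start, std::size_t end)
{
    cells.erase(cells.begin() + static_cast<std::ptrdiff_t>(start),
        cells.begin() + static_cast<std::ptrdiff_t>(end));
}
```

Cells outside the edited range keep their colors, so only the changed region needs to be queued for re-highlighting.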
Thread Safety
From src/syntax.cpp:73-82:
void ZepSyntax::Interrupt()
{
    // Stop the thread, wait for it
    m_stop = true;
    if (m_syntaxResult.valid())
    {
        m_syntaxResult.get();
    }
    m_stop = false;
}
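Syntax updates can be interrupted when the buffer changes. Interrupt() sets the atomic m_stop flag, waits on m_syntaxResult until the in-progress pass has returned, and then clears the flag so the next pass can run. Because UpdateSyntax checks m_stop on every token, cancellation is prompt and leaves no half-finished worker running.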
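The stop-flag pattern can be reproduced in isolation with the standard library. The Worker class below is a hypothetical sketch, not Zep's code; it assumes the background pass is launched with std::async and that the result is held in a std::future.

```cpp
#include <atomic>
#include <future>

class Worker
{
public:
    ~Worker() { Interrupt(); }

    // Launch a background pass over [0, target).
    void Start(long target)
    {
        Interrupt();
        m_processed = 0;
        m_result = std::async(std::launch::async, [this, target] {
            for (long i = 0; i < target; ++i)
            {
                if (m_stop)
                {
                    return;  // cooperative cancellation point
                }
                m_processed = i + 1;
            }
        });
    }

    // Ask the pass to stop, block until it has exited, then re-arm the flag.
    void Interrupt()
    {
        m_stop = true;
        if (m_result.valid())
        {
            m_result.get();
        }
        m_stop = false;
    }

    // Block until the current pass has finished or been interrupted.
    void Wait() const
    {
        if (m_result.valid())
        {
            m_result.wait();
        }
    }

    long Processed() const { return m_processed; }

private:
    std::atomic<bool> m_stop{false};
    std::atomic<long> m_processed{0};
    std::future<void> m_result;
};
```

The flag is only cleared after get() returns, so a new pass can never observe a stale stop request from the previous one.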
Secondary Syntax (Adornments)
Zep supports “adornments” - secondary syntax highlighters that overlay the primary:
From include/zep/syntax.h:100-115:
class ZepSyntaxAdorn : public ZepComponent
{
public:
    ZepSyntaxAdorn(ZepSyntax& syntax, ZepBuffer& buffer)
        : ZepComponent(syntax.GetEditor())
        , m_buffer(buffer)
        , m_syntax(syntax)
    {
    }

    virtual SyntaxResult GetSyntaxAt(const GlyphIterator& offset,
        bool& found) const = 0;

protected:
    ZepBuffer& m_buffer;
    ZepSyntax& m_syntax;
};
Rainbow Brackets
Zep includes a rainbow bracket adornment:
From src/syntax.cpp:28:
m_adornments.push_back(
    std::make_shared<ZepSyntaxAdorn_RainbowBrackets>(*this, m_buffer)
);
Rainbow brackets color matching pairs of brackets differently to aid readability.
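One common way to implement this kind of coloring is to pick a palette entry from the nesting depth, so that an opening bracket and its matching closer share a color. The sketch below assumes that approach; it is not taken from ZepSyntaxAdorn_RainbowBrackets, and BracketColors is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns one entry per character: -1 for non-brackets, otherwise an index
// into a palette of paletteSize colors (paletteSize must be > 0).
inline std::vector<int> BracketColors(const std::string& text, int paletteSize)
{
    std::vector<int> colors(text.size(), -1);
    int depth = 0;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        const char c = text[i];
        if (c == '(' || c == '[' || c == '{')
        {
            colors[i] = depth % paletteSize;
            ++depth;
        }
        else if (c == ')' || c == ']' || c == '}')
        {
            if (depth > 0)
            {
                --depth;  // tolerate unbalanced closers
            }
            colors[i] = depth % paletteSize;
        }
    }
    return colors;
}
```

For the input (a[b]) with a three-color palette, the outer pair gets index 0 and the inner pair gets index 1.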
Creating a Custom Syntax Highlighter
Step 1: Define Keywords
class ZepSyntax_MyLanguage : public ZepSyntax
{
public:
    ZepSyntax_MyLanguage(ZepBuffer& buffer)
        : ZepSyntax(buffer,
            GetKeywords(),
            GetIdentifiers(),
            0) // flags
    {
    }

    static std::unordered_set<std::string> GetKeywords()
    {
        return {
            "function", "if", "else", "while", "for",
            "return", "var", "const", "let",
            "class", "new", "this", "super"
        };
    }

    static std::unordered_set<std::string> GetIdentifiers()
    {
        return {
            "console", "window", "document",
            "Array", "Object", "String", "Number"
        };
    }
};
Step 2: Override UpdateSyntax (Optional)
For more complex syntax (like multi-line comments), override UpdateSyntax():
// Requires `void UpdateSyntax() override;` in the class declaration from Step 1.
// `override` belongs on that declaration, not on this out-of-class definition.
void ZepSyntax_MyLanguage::UpdateSyntax()
{
    // Call base implementation first
    ZepSyntax::UpdateSyntax();

    // Add custom syntax rules
    auto& buffer = m_buffer.GetWorkingBuffer();

    // Find multi-line comments /* */
    auto itr = buffer.begin();
    while (itr != buffer.end())
    {
        if (*itr == '/' && (itr + 1) != buffer.end() && *(itr + 1) == '*')
        {
            auto start = itr;
            itr += 2;

            // Find closing */ (an unterminated comment is left unmarked)
            while (itr != buffer.end())
            {
                if (*itr == '*' && (itr + 1) != buffer.end() && *(itr + 1) == '/')
                {
                    itr += 2;
                    // mark() colors [start, itr); it is assumed to be reachable from this
                    // override. If it is not, write the colors into m_syntax directly.
                    mark(start, itr, ThemeColor::Comment, ThemeColor::None);
                    break;
                }
                itr++;
            }
        }
        else
        {
            itr++;
        }
    }
}
Step 3: Register with Editor
editor.RegisterSyntaxFactory(
    { ".mylang", ".ml" },
    SyntaxProvider{
        "mylanguage",
        [](ZepBuffer* buffer) {
            return std::make_shared<ZepSyntax_MyLanguage>(*buffer);
        }
    }
);
Getting Syntax at Cursor
To get syntax information for a specific location:
From src/syntax.cpp:36-63:
SyntaxResult ZepSyntax::GetSyntaxAt(const GlyphIterator& offset) const
{
    SyntaxResult result;
    Wait(); // Wait for syntax update to complete

    if (m_processedChar < offset.Index() ||
        (long)m_syntax.size() <= offset.Index())
    {
        return result; // Not processed yet
    }

    result.background = m_syntax[offset.Index()].background;
    result.foreground = m_syntax[offset.Index()].foreground;
    result.underline = m_syntax[offset.Index()].underline;

    // Check adornments (like rainbow brackets)
    bool found = false;
    for (auto& adorn : m_adornments)
    {
        auto adornResult = adorn->GetSyntaxAt(offset, found);
        if (found)
        {
            result = adornResult;
            break;
        }
    }
    return result;
}
Adornments take priority over base syntax, allowing features like rainbow brackets to override default bracket colors.
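The override rule can be reduced to a few lines. In this sketch, Overlay and ColorAt are hypothetical names and colors are plain integers; it only illustrates that the first overlay to claim a position wins over the base table.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// An overlay returns a color and sets `found` when it claims the index.
using Overlay = std::function<int(std::size_t index, bool& found)>;

inline int ColorAt(const std::vector<int>& base,
    const std::vector<Overlay>& overlays, std::size_t index)
{
    if (index >= base.size())
    {
        return 0;  // not processed yet: default color
    }
    for (const auto& overlay : overlays)
    {
        bool found = false;
        const int color = overlay(index, found);
        if (found)
        {
            return color;  // first overlay to claim the index wins
        }
    }
    return base[index];
}
```

Positions that no overlay claims fall back to the base color, so an adornment only needs to know about the characters it cares about.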
Incremental Updates - Only changed regions are re-highlighted, not the entire file
Asynchronous - Highlighting runs in the background without blocking input
Interruptible - In-progress highlighting stops when the buffer changes
Per-Character Cache - Each character’s syntax is cached for fast lookups
Best Practices
Keep keyword sets small - Large sets slow down token matching
Use keyword sets - Don’t implement custom string matching for common keywords
Minimize regex - String operations are faster than regex for simple patterns
Test large files - Ensure highlighting performs well on 10,000+ line files
Built-in Syntax Highlighters
Zep includes syntax highlighters for several languages:
C/C++ - Full C++ syntax with preprocessor support
CMake - CMake language support
Markdown - CommonMark support with heading styles
GLSL - OpenGL Shading Language
Lisp/Scheme - S-expression aware highlighting
Check the src/syntax_*.cpp files for implementation examples.
Next Steps
Buffers - Understand how syntax data is stored
Display Layer - Learn how syntax colors are rendered