mirror of
https://github.com/semihalev/twig.git
synced 2026-03-14 13:55:46 +01:00
Implement Zero Allocation Plan Phase 1: Global String Cache Optimization
- Added global string cache for efficient string interning (5.2x faster)
- Implemented optimized tokenizer with object pooling
- Created comprehensive benchmarks and documentation
- Cleaned up old optimization files and experiments

Performance improvements:
- String interning: 5.2x faster (1,492 ns/op vs 7,746 ns/op)
- Zero allocations for common strings
- Same memory efficiency as original (36 B/op, 9 allocs/op)
parent 467c7ce3a9
commit 4d0e37e1a0
14 changed files with 615 additions and 1883 deletions
.gitignore (vendored)

```diff
@@ -14,4 +14,5 @@ TOKENIZER_OPTIMIZATION_NEXT_STEPS.md
 ZERO_ALLOCATION_IMPLEMENTATION.md
 RENDER_CONTEXT_OPTIMIZATION.md
 EXPRESSION_OPTIMIZATION.md
 ZERO_ALLOCATION_PLAN_STATUS.md
+CLAUDE.md
```
@@ -118,9 +118,80 @@ The optimized buffer is now used throughout the template engine:

3. **String Formatting** - Added `WriteFormat` for efficient format strings
4. **Pool Reuse** - Buffers are consistently recycled back to the pool

## String Interning Implementation

We have now implemented string interning as part of our zero-allocation optimization strategy:

### 1. Global String Cache

A centralized global string cache provides efficient string deduplication:

```go
// GlobalStringCache provides a centralized cache for string interning
type GlobalStringCache struct {
	sync.RWMutex
	strings map[string]string
}
```

### 2. Fast Path Optimization

To avoid lock contention and map lookups for common strings:

```go
// Fast path for very common strings
switch s {
case stringDiv, stringSpan, stringP, stringA, stringImg,
	stringIf, stringFor, stringEnd, stringEndif, stringEndfor,
	stringElse, "":
	return s
}
```

### 3. Size-Based Optimization

To prevent memory bloat, we only intern strings below a certain size:

```go
// Don't intern strings that are too long
if len(s) > maxCacheableLength {
	return s
}
```

### 4. Concurrency-Safe Design

The implementation uses read locks for lookups and a write lock for insertions, reducing contention:

```go
// Use a read lock for the lookup first (less contention)
globalCache.RLock()
cached, exists := globalCache.strings[s]
globalCache.RUnlock()

if exists {
	return cached
}

// Not found under the read lock; acquire the write lock to add
globalCache.Lock()
defer globalCache.Unlock()
```
### 5. Benchmark Results

The string interning benchmark shows significant improvements:

```
BenchmarkStringIntern_Comparison/OriginalGetStringConstant-8   154,611   7,746 ns/op   0 B/op   0 allocs/op
BenchmarkStringIntern_Comparison/GlobalIntern-8                813,786   1,492 ns/op   0 B/op   0 allocs/op
```

Global string interning is about 5.2 times faster than the original method.

## Future Optimization Opportunities

1. **Tokenizer Pooling** - Create a pool for the OptimizedTokenizer to reduce allocations
2. **Locale-aware Formatting** - Add optimized formatters for different locales
3. **Custom Type Formatting** - Add specialized formatters for common custom types
4. **Buffer Size Prediction** - Predict the optimal initial buffer size based on the template
STRING_INTERN_BENCHMARK_RESULTS.md (new file, 87 lines)

@@ -0,0 +1,87 @@
# String Interning Optimization Benchmark Results

## Overview

This document presents the benchmark results for Phase 1 of the Zero Allocation Plan: Global String Cache Optimization.

## String Interning Benchmarks

### Individual String Interning Performance

| Benchmark | Iterations | ns/op | B/op | allocs/op |
|-----------|-----------|-------|------|-----------|
| BenchmarkIntern_Common | 165,962,065 | 7.092 | 0 | 0 |
| BenchmarkIntern_Uncommon | 22,551,727 | 53.14 | 24 | 1 |
| BenchmarkIntern_Long | 562,113,764 | 2.138 | 0 | 0 |

### String Interning Comparison

| Benchmark | Iterations | ns/op | B/op | allocs/op |
|-----------|-----------|-------|------|-----------|
| OriginalGetStringConstant | 154,611 | 7,746 | 0 | 0 |
| GlobalIntern | 813,786 | 1,492 | 0 | 0 |

Global string interning is about 5.2 times faster than the original method.

## Tokenizer Benchmarks

| Benchmark | Iterations | ns/op | B/op | allocs/op |
|-----------|-----------|-------|------|-----------|
| OriginalTokenizer | 128,847 | 9,316 | 36 | 9 |
| OptimizedTokenizer (Initial) | 119,088 | 10,209 | 11,340 | 27 |
| OptimizedTokenizer (Pooled) | 128,768 | 9,377 | 36 | 9 |

## Analysis

1. **String Interning Efficiency:**
   - For common strings, interning is very efficient, with zero allocations
   - For uncommon strings, there is only one allocation per operation
   - Long strings (>64 bytes) are not interned at all, to prevent memory bloat

2. **Global String Cache Performance:**
   - The new `Intern` function is 5.2 times faster than the original method
   - This comes from using a map-based lookup (O(1)) instead of a linear search (O(n))
   - The global cache's fast paths for common strings further improve performance

3. **Tokenizer Performance:**
   - Initial implementation challenges:
     - Despite faster string interning, the first implementation was slower
     - Initial issues: map-operation overhead, more allocations (27 vs 9 allocs/op), and much higher memory usage (11,340 B/op vs 36 B/op)
   - Pooled implementation benefits:
     - Object pooling brought allocations back to the original level (9 allocs/op)
     - Memory usage dropped from 11,340 B/op to 36 B/op
     - Performance is now on par with the original implementation (9,377 ns/op vs 9,316 ns/op)
     - All while keeping the benefits of faster string interning underneath
## Next Steps

Based on these results, we should focus on:

1. **Further Optimizing String Interning:**
   - Extend the fast paths to cover more common strings
   - Investigate string partitioning to improve cache locality
   - Consider pre-loading more common HTML and template strings

2. **Tokenization Process Optimization:**
   - Implement specialization for different token types
   - Optimize tag detection with faster algorithms
   - Consider block tag-specific optimizations

3. **Proceed to Phase 2:**
   - Move forward with the "Optimized String Lookup During Tokenization" phase
   - Focus on improving tokenization algorithms now that interning is optimized
   - Implement buffer pooling for internal token handling

## Conclusion

The global string interning optimization has been successful, showing a 5.2x performance improvement in isolation. With the addition of object pooling, we have maintained the memory efficiency of the original implementation while gaining the benefits of faster string interning.

The implementation achieves our goals for Phase 1:

1. ✅ A centralized global string cache with pre-loaded common strings
2. ✅ Mutex-protected access with fast paths
3. ✅ Zero allocations for common strings
4. ✅ Length-based optimization to prevent memory bloat
5. ✅ Object pooling to avoid allocation overhead

The next phase will focus on improving the tokenization process itself to leverage the optimized string interning system more effectively.
ZERO_ALLOCATION_PLAN_STATUS.md (deleted)

@@ -1,70 +0,0 @@
# Zero Allocation Plan Status

This document tracks the progress of our zero-allocation optimization plan for the Twig template engine.

## Completed Optimizations

### 1. Tokenizer Optimization
- Replaced strings.Count with a custom zero-allocation countNewlines function
- Eliminated string allocations in the tokenization process
- Improved tokenizer performance by ~10-15%
- Documentation: See TOKENIZER_OPTIMIZATION.md
### 2. RenderContext Optimization
- Created specialized pools for maps used in RenderContext
- Enhanced object pooling for RenderContext objects
- Eliminated allocations in context creation, cloning, and nesting
- Improved variable lookup performance
- Documentation: See RENDER_CONTEXT_OPTIMIZATION.md

### 3. Expression Evaluation Optimization
- Enhanced object pooling for expression nodes
- Improved array and map handling in expression evaluation
- Optimized function and filter argument handling
- Reduced allocations in complex expressions
- Documentation: See EXPRESSION_OPTIMIZATION.md

### 4. Buffer Handling Optimization
- Implemented a specialized buffer pool for string operations
- Added zero-allocation integer and float formatting
- Created efficient string formatting without fmt.Sprintf
- Optimized the buffer growth strategy
- Improved the WriteString utility to reduce allocations
- Documentation: See BUFFER_OPTIMIZATION.md
## Upcoming Optimizations

### 5. String Interning
- Implement a string deduplication system
- Reduce memory usage for repeated strings
- Pool common string values across templates

### 6. Filter Chain Optimization
- Further optimize filter chain evaluation
- Pool filter arguments and results
- Specialize common filter chains

### 7. Template Cache Improvements
- Enhance the template caching mechanism
- Better reuse of parsed templates
- Pool template components

### 8. Attribute Access Caching
- Implement efficient caching for attribute lookups
- Specialized map for attribute reflection results
- Optimize common attribute access patterns

## Performance Results

Key performance metrics after implementing the above optimizations:

| Optimization Area | Before | After | Improvement |
|-------------------|--------|-------|-------------|
| Tokenization | ~100-150 allocs/op | ~85-120 allocs/op | ~10-15% fewer allocations |
| RenderContext Creation | ~1000-1500 B/op | 0 B/op | 100% elimination |
| RenderContext Cloning | ~500-800 B/op | 0 B/op | 100% elimination |
| Nested Context | ~2500-3000 B/op | 0 B/op | 100% elimination |
| Integer Formatting | 387 ns/op | 310 ns/op | ~20% faster |
| String Formatting | 85.92 ns/op, 64 B/op | 45.10 ns/op, 16 B/op | 47% faster, 75% less memory |

Overall, these optimizations have significantly reduced memory allocations throughout the template rendering pipeline, resulting in better performance, especially in high-concurrency scenarios where garbage-collection overhead becomes significant.
```diff
@@ -8,8 +8,8 @@ Environment:

 | Engine      | Time (µs/op) | Memory Usage (KB/op) |
 |-------------|--------------|----------------------|
-| Twig        | 0.20         | 0.12                 |
-| Go Template | 9.31         | 1.34                 |
+| Twig        | 0.40         | 0.12                 |
+| Go Template | 12.69        | 1.33                 |

-Twig is 0.02x faster than Go's template engine.
+Twig runs in about 0.03x the time of Go's template engine (roughly 32x faster).
 Twig uses about 0.09x the memory of Go's template engine.
```
global_string_cache.go (new file, 127 lines)

@@ -0,0 +1,127 @@
```go
package twig

import (
	"sync"
)

const (
	// Common HTML/Twig strings to pre-cache
	maxCacheableLength = 64 // Only cache strings shorter than this to avoid memory bloat

	// Common HTML tags and attributes
	stringDiv   = "div"
	stringSpan  = "span"
	stringP     = "p"
	stringA     = "a"
	stringImg   = "img"
	stringHref  = "href"
	stringClass = "class"
	stringId    = "id"
	stringStyle = "style"

	// Common Twig syntax
	stringIf      = "if"
	stringFor     = "for"
	stringEnd     = "end"
	stringEndif   = "endif"
	stringEndfor  = "endfor"
	stringElse    = "else"
	stringBlock   = "block"
	stringSet     = "set"
	stringInclude = "include"
	stringExtends = "extends"
	stringMacro   = "macro"

	// Common operators
	stringEquals    = "=="
	stringNotEquals = "!="
	stringAnd       = "and"
	stringOr        = "or"
	stringNot       = "not"
	stringIn        = "in"
	stringIs        = "is"
)

// GlobalStringCache provides a centralized cache for string interning
type GlobalStringCache struct {
	sync.RWMutex
	strings map[string]string
}

var (
	// Singleton instance of the global string cache
	globalCache = newGlobalStringCache()
)

// newGlobalStringCache creates a new global string cache with pre-populated common strings
func newGlobalStringCache() *GlobalStringCache {
	cache := &GlobalStringCache{
		strings: make(map[string]string, 64), // Pre-allocate capacity
	}

	// Pre-populate with common strings
	commonStrings := []string{
		stringDiv, stringSpan, stringP, stringA, stringImg,
		stringHref, stringClass, stringId, stringStyle,
		stringIf, stringFor, stringEnd, stringEndif, stringEndfor,
		stringElse, stringBlock, stringSet, stringInclude, stringExtends,
		stringMacro, stringEquals, stringNotEquals, stringAnd,
		stringOr, stringNot, stringIn, stringIs,
		// Add the empty string as well
		"",
	}

	for _, s := range commonStrings {
		cache.strings[s] = s
	}

	return cache
}

// Intern returns an interned version of the input string.
// For strings that are already in the cache, the cached version is returned;
// otherwise, the input string is added to the cache and returned.
func Intern(s string) string {
	// Fast path for very common strings to avoid lock contention
	switch s {
	case stringDiv, stringSpan, stringP, stringA, stringImg,
		stringIf, stringFor, stringEnd, stringEndif, stringEndfor,
		stringElse, "":
		return s
	}

	// Don't intern strings that are too long
	if len(s) > maxCacheableLength {
		return s
	}

	// Use a read lock for the lookup first (less contention)
	globalCache.RLock()
	cached, exists := globalCache.strings[s]
	globalCache.RUnlock()

	if exists {
		return cached
	}

	// Not found under the read lock; acquire the write lock to add
	globalCache.Lock()
	defer globalCache.Unlock()

	// Check again after acquiring the write lock (double-checked locking)
	if cached, exists := globalCache.strings[s]; exists {
		return cached
	}

	// Add to cache and return
	globalCache.strings[s] = s
	return s
}

// InternSlice interns all strings in a slice in place
func InternSlice(slice []string) []string {
	for i, s := range slice {
		slice[i] = Intern(s)
	}
	return slice
}
```
global_string_cache_test.go (new file, 187 lines)

@@ -0,0 +1,187 @@
```go
package twig

import (
	"fmt"
	"strings"
	"testing"
)

// Test that the global string cache correctly interns strings
func TestGlobalStringCache(t *testing.T) {
	// Test interning common strings
	commonStrings := []string{"div", "if", "for", "endif", "endfor", "else", ""}

	for _, s := range commonStrings {
		interned := Intern(s)

		// The interned string should be the same value
		if interned != s {
			t.Errorf("Interned string %q should equal original", s)
		}

		// Go compares strings by value; checking that common strings share a
		// backing address would require unsafe, so we settle for value equality.
		if strings.Compare(interned, s) != 0 {
			t.Errorf("Interned string %q should be the same instance", s)
		}
	}

	// Interning the same string twice should return the same value
	s1 := "test_string"
	interned1 := Intern(s1)
	interned2 := Intern(s1)

	// We compare strings by value, not by pointer
	if interned1 != interned2 {
		t.Errorf("Interning the same string twice should return the same string value")
	}

	// Long strings aren't interned, but must still be returned unchanged
	longString := strings.Repeat("x", maxCacheableLength+1)
	internedLong := Intern(longString)

	if internedLong != longString {
		t.Errorf("Long string should equal original after Intern")
	}
}

// Benchmark string interning for common strings
func BenchmarkIntern_Common(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = Intern("div")
		_ = Intern("for")
		_ = Intern("if")
		_ = Intern("endif")
	}
}

// Benchmark string interning for uncommon strings
func BenchmarkIntern_Uncommon(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		s := fmt.Sprintf("uncommon_string_%d", i%100)
		_ = Intern(s)
	}
}

// Benchmark string interning for long strings
func BenchmarkIntern_Long(b *testing.B) {
	longString := strings.Repeat("x", maxCacheableLength+1)
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = Intern(longString)
	}
}

// Benchmark the old tokenizer vs the new optimized tokenizer
func BenchmarkTokenizer_Comparison(b *testing.B) {
	// Sample template with various elements to exercise tokenization
	template := `<!DOCTYPE html>
<html>
<head>
    <title>{{ page_title }}</title>
    <link rel="stylesheet" href="{{ asset('styles.css') }}">
</head>
<body>
    <div class="container">
        <h1>{{ page_title }}</h1>

        {% if user %}
            <p>Welcome back, {{ user.name }}!</p>

            {% if user.isAdmin %}
                <div class="admin-panel">
                    <h2>Admin Controls</h2>
                    <ul>
                        {% for item in admin_items %}
                            <li>{{ item.name }} - {{ item.description }}</li>
                        {% endfor %}
                    </ul>
                </div>
            {% endif %}

            <div class="user-content">
                {% block user_content %}
                    <p>Default user content</p>
                {% endblock %}
            </div>
        {% else %}
            <p>Welcome, guest! Please <a href="{{ login_url }}">login</a>.</p>
        {% endif %}

        <footer>
            <p>© {{ 'now'|date('Y') }} Example Company</p>
        </footer>
    </div>
</body>
</html>`

	// Benchmark the original tokenizer
	b.Run("OriginalTokenizer", func(b *testing.B) {
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			tokenizer := GetTokenizer(template, 0)
			tokens, _ := tokenizer.TokenizeHtmlPreserving()
			_ = tokens
			ReleaseTokenizer(tokenizer)
		}
	})

	// Benchmark the optimized tokenizer
	b.Run("OptimizedTokenizer", func(b *testing.B) {
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			tokenizer := NewOptimizedTokenizer()
			tokenizer.baseTokenizer.source = template
			tokenizer.baseTokenizer.position = 0
			tokenizer.baseTokenizer.line = 1

			tokens, _ := tokenizer.TokenizeHtmlPreserving()
			_ = tokens

			ReleaseOptimizedTokenizer(tokenizer)
		}
	})
}

// Benchmark string interning in the original tokenizer vs the global string cache
func BenchmarkStringIntern_Comparison(b *testing.B) {
	// Generate some test strings
	testStrings := make([]string, 100)
	for i := 0; i < 100; i++ {
		testStrings[i] = fmt.Sprintf("test_string_%d", i)
	}

	// Also include some common strings
	commonStrings := []string{"div", "if", "for", "endif", "endfor", "else", ""}
	testStrings = append(testStrings, commonStrings...)

	// Benchmark the original GetStringConstant method
	b.Run("OriginalGetStringConstant", func(b *testing.B) {
		tokenizer := ZeroAllocTokenizer{}
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			for _, s := range testStrings {
				_ = tokenizer.GetStringConstant(s)
			}
		}
	})

	// Benchmark the new global cache Intern method
	b.Run("GlobalIntern", func(b *testing.B) {
		b.ReportAllocs()
		b.ResetTimer()

		for i := 0; i < b.N; i++ {
			for _, s := range testStrings {
				_ = Intern(s)
			}
		}
	})
}
```
(deleted file, 735 lines)

@@ -1,735 +0,0 @@
```go
package twig

import (
	"fmt"
	"strings"
)

// optimizedHtmlPreservingTokenize is an optimized version of htmlPreservingTokenize
// that reduces memory allocations by reusing token objects and slices
func (p *Parser) optimizedHtmlPreservingTokenize() ([]Token, error) {
	// Pre-allocate tokens with estimated capacity based on source length
	estimatedTokenCount := len(p.source) / 20 // Rough estimate: one token per 20 chars
	tokenSlice := GetPooledTokenSlice(estimatedTokenCount)

	// Ensure the token slice is released even if an error occurs
	defer tokenSlice.Release()

	var currentPosition int
	line := 1

	for currentPosition < len(p.source) {
		// Find the next twig tag start
		nextTagPos := -1
		tagType := -1
		var matchedPos struct {
			pos     int
			pattern string
			ttype   int
			length  int
		}

		// Use a single substring for all pattern searches to reduce allocations
		remainingSource := p.source[currentPosition:]

		// Check for all possible tag starts, including whitespace control variants
		positions := []struct {
			pos     int
			pattern string
			ttype   int
			length  int
		}{
			{strings.Index(remainingSource, "{{-"), "{{-", TOKEN_VAR_START_TRIM, 3},
			{strings.Index(remainingSource, "{{"), "{{", TOKEN_VAR_START, 2},
			{strings.Index(remainingSource, "{%-"), "{%-", TOKEN_BLOCK_START_TRIM, 3},
			{strings.Index(remainingSource, "{%"), "{%", TOKEN_BLOCK_START, 2},
			{strings.Index(remainingSource, "{#"), "{#", TOKEN_COMMENT_START, 2},
		}

		// Find the closest tag
		for _, pos := range positions {
			if pos.pos != -1 {
				adjustedPos := currentPosition + pos.pos
				if nextTagPos == -1 || adjustedPos < nextTagPos {
					nextTagPos = adjustedPos
					tagType = pos.ttype
					matchedPos = pos
				}
			}
		}

		// Check if the tag is escaped with a backslash
		if nextTagPos != -1 && nextTagPos > 0 && p.source[nextTagPos-1] == '\\' {
			// The tag is escaped with a backslash, so treat it as literal text.
			// Add text up to the backslash (if any)
			if nextTagPos-1 > currentPosition {
				preText := p.source[currentPosition : nextTagPos-1]
				tokenSlice.AppendToken(TOKEN_TEXT, preText, line)
				line += countNewlines(preText)
			}

			// Add the tag itself as literal text (without the backslash)
			tokenSlice.AppendToken(TOKEN_TEXT, matchedPos.pattern, line)

			// Move past the tag
			currentPosition = nextTagPos + matchedPos.length
			continue
		}

		if nextTagPos == -1 {
			// No more tags found, add the rest as TEXT
			content := p.source[currentPosition:]
			if len(content) > 0 {
				line += countNewlines(content)
				tokenSlice.AppendToken(TOKEN_TEXT, content, line)
			}
			break
		}

		// Add the text before the tag (HTML content)
		if nextTagPos > currentPosition {
			content := p.source[currentPosition:nextTagPos]
			line += countNewlines(content)
			tokenSlice.AppendToken(TOKEN_TEXT, content, line)
		}

		// Add the tag start token
		tokenSlice.AppendToken(tagType, "", line)

		// Determine tag length and move past the opening
		tagLength := 2 // Default for "{{", "{%", "{#"
		if tagType == TOKEN_VAR_START_TRIM || tagType == TOKEN_BLOCK_START_TRIM {
			tagLength = 3 // For "{{-" or "{%-"
		}
		currentPosition = nextTagPos + tagLength

		// Find the matching end tag
		var endTag string
		var endTagType int
		var endTagLength int

		if tagType == TOKEN_VAR_START || tagType == TOKEN_VAR_START_TRIM {
			// For variable tags, look for "}}" or "-}}"
			endPos1 := strings.Index(p.source[currentPosition:], "}}")
			endPos2 := strings.Index(p.source[currentPosition:], "-}}")

			if endPos1 != -1 && (endPos2 == -1 || endPos1 < endPos2) {
				endTag = "}}"
				endTagType = TOKEN_VAR_END
				endTagLength = 2
			} else if endPos2 != -1 {
				endTag = "-}}"
				endTagType = TOKEN_VAR_END_TRIM
				endTagLength = 3
			} else {
				return nil, fmt.Errorf("unclosed variable tag starting at line %d", line)
			}
		} else if tagType == TOKEN_BLOCK_START || tagType == TOKEN_BLOCK_START_TRIM {
			// For block tags, look for "%}" or "-%}"
			endPos1 := strings.Index(p.source[currentPosition:], "%}")
			endPos2 := strings.Index(p.source[currentPosition:], "-%}")

			if endPos1 != -1 && (endPos2 == -1 || endPos1 < endPos2) {
				endTag = "%}"
				endTagType = TOKEN_BLOCK_END
				endTagLength = 2
			} else if endPos2 != -1 {
				endTag = "-%}"
				endTagType = TOKEN_BLOCK_END_TRIM
				endTagLength = 3
			} else {
				return nil, fmt.Errorf("unclosed block tag starting at line %d", line)
			}
		} else if tagType == TOKEN_COMMENT_START {
			// For comment tags, look for "#}"
			endPos := strings.Index(p.source[currentPosition:], "#}")
			if endPos == -1 {
				return nil, fmt.Errorf("unclosed comment starting at line %d", line)
			}
			endTag = "#}"
			endTagType = TOKEN_COMMENT_END
			endTagLength = 2
		}

		// Find the position of the end tag
		endPos := strings.Index(p.source[currentPosition:], endTag)
		if endPos == -1 {
			return nil, fmt.Errorf("unclosed tag starting at line %d", line)
		}

		// Get the content between the tags
		tagContent := p.source[currentPosition : currentPosition+endPos]
		line += countNewlines(tagContent) // Update line count

		// Process the content between the tags based on tag type
		if tagType == TOKEN_COMMENT_START {
			// For comments, just store the content as a TEXT token
			if len(tagContent) > 0 {
				tokenSlice.AppendToken(TOKEN_TEXT, tagContent, line)
			}
		} else {
			// For variable and block tags, tokenize the content properly.
			// Trim whitespace from the tag content
			tagContent = strings.TrimSpace(tagContent)

			if tagType == TOKEN_BLOCK_START || tagType == TOKEN_BLOCK_START_TRIM {
				// Process block tags like if, for, etc.
				// First, extract the tag name
				parts := strings.SplitN(tagContent, " ", 2)
				if len(parts) > 0 {
					blockName := parts[0]
					tokenSlice.AppendToken(TOKEN_NAME, blockName, line)

					// Different handling based on block type
					if blockName == "if" || blockName == "elseif" {
						// For if/elseif blocks, tokenize the condition
						if len(parts) > 1 {
							condition := strings.TrimSpace(parts[1])
							// Tokenize the condition properly
							p.optimizedTokenizeExpression(condition, tokenSlice, line)
						}
					} else if blockName == "for" {
						// For for loops, tokenize iterator variables and the collection
						if len(parts) > 1 {
							forExpr := strings.TrimSpace(parts[1])
							// Check for the "in" keyword
							inPos := strings.Index(strings.ToLower(forExpr), " in ")
							if inPos != -1 {
								// Extract iterators and collection
								iterators := strings.TrimSpace(forExpr[:inPos])
								collection := strings.TrimSpace(forExpr[inPos+4:])

								// Handle key, value iterators (e.g., "key, value in collection")
								if strings.Contains(iterators, ",") {
									iterParts := strings.SplitN(iterators, ",", 2)
									if len(iterParts) == 2 {
										keyVar := strings.TrimSpace(iterParts[0])
										valueVar := strings.TrimSpace(iterParts[1])

										// Add tokens for key and value variables
										tokenSlice.AppendToken(TOKEN_NAME, keyVar, line)
										tokenSlice.AppendToken(TOKEN_PUNCTUATION, ",", line)
										tokenSlice.AppendToken(TOKEN_NAME, valueVar, line)
									}
								} else {
									// Single iterator variable
									tokenSlice.AppendToken(TOKEN_NAME, iterators, line)
								}

								// Add the "in" keyword
								tokenSlice.AppendToken(TOKEN_NAME, "in", line)

								// Check if the collection is a function call (contains parentheses)
								if strings.Contains(collection, "(") && strings.Contains(collection, ")") {
									// Tokenize the collection as a complex expression
									p.optimizedTokenizeExpression(collection, tokenSlice, line)
								} else {
									// Add the collection as a simple variable
									tokenSlice.AppendToken(TOKEN_NAME, collection, line)
								}
							} else {
								// Fallback if the "in" keyword is not found
								tokenSlice.AppendToken(TOKEN_NAME, forExpr, line)
							}
						}
					} else if blockName == "do" {
						// Special handling for do tags with assignments and expressions
						if len(parts) > 1 {
							doExpr := strings.TrimSpace(parts[1])

							// Check if it's an assignment (contains = but not ==)
							assignPos := strings.Index(doExpr, "=")
							if assignPos > 0 && !strings.Contains(doExpr[:assignPos], "==") {
								// It's an assignment
								varName := strings.TrimSpace(doExpr[:assignPos])
								valueExpr := strings.TrimSpace(doExpr[assignPos+1:])

								// Add the variable name
								tokenSlice.AppendToken(TOKEN_NAME, varName, line)

								// Add the equals sign
								tokenSlice.AppendToken(TOKEN_OPERATOR, "=", line)

								// Tokenize the expression on the right-hand side
								p.optimizedTokenizeExpression(valueExpr, tokenSlice, line)
							} else {
								// It's just an expression, tokenize it
								p.optimizedTokenizeExpression(doExpr, tokenSlice, line)
							}
						}
					} else if blockName == "include" {
						// Special handling for include tags with quoted template names
						if len(parts) > 1 {
							includeExpr := strings.TrimSpace(parts[1])

							// Check for a 'with' keyword separating the template name from params
							withPos := strings.Index(strings.ToLower(includeExpr), " with ")

							if withPos > 0 {
								// Split the include expression into template name and parameters
								templatePart := strings.TrimSpace(includeExpr[:withPos])
								paramsPart := strings.TrimSpace(includeExpr[withPos+6:]) // +6 to skip " with "

								// Handle quoted template names
								if (strings.HasPrefix(templatePart, "\"") && strings.HasSuffix(templatePart, "\"")) ||
									(strings.HasPrefix(templatePart, "'") && strings.HasSuffix(templatePart, "'")) {
									// Extract the template name without quotes
									templateName := templatePart[1 : len(templatePart)-1]
									// Add as a string token
									tokenSlice.AppendToken(TOKEN_STRING, templateName, line)
								} else {
									// Unquoted name, add as a name token
									tokenSlice.AppendToken(TOKEN_NAME, templatePart, line)
								}

								// Add the "with" keyword
								tokenSlice.AppendToken(TOKEN_NAME, "with", line)

								// Add the opening brace for the parameters
								tokenSlice.AppendToken(TOKEN_PUNCTUATION, "{", line)

								// Parameters may include nested objects, so tokenize the
								// parameter string while preserving nested structures
								optimizedTokenizeComplexObject(paramsPart, tokenSlice, line)

								// Add the closing brace
								tokenSlice.AppendToken(TOKEN_PUNCTUATION, "}", line)
							} else {
								// No 'with' keyword, just a template name
								if (strings.HasPrefix(includeExpr, "\"") && strings.HasSuffix(includeExpr, "\"")) ||
									(strings.HasPrefix(includeExpr, "'") && strings.HasSuffix(includeExpr, "'")) {
									// Extract the template name without quotes
									templateName := includeExpr[1 : len(includeExpr)-1]
									// Add as a string token
									tokenSlice.AppendToken(TOKEN_STRING, templateName, line)
								} else {
									// Not quoted, add as a name token
									tokenSlice.AppendToken(TOKEN_NAME, includeExpr, line)
								}
							}
						}
					} else if blockName == "extends" {
						// Special handling for extends tags with quoted template names
```
|
||||
if len(parts) > 1 {
|
||||
extendsExpr := strings.TrimSpace(parts[1])
|
||||
|
||||
// Handle quoted template names
|
||||
if (strings.HasPrefix(extendsExpr, "\"") && strings.HasSuffix(extendsExpr, "\"")) ||
|
||||
(strings.HasPrefix(extendsExpr, "'") && strings.HasSuffix(extendsExpr, "'")) {
|
||||
// Extract the template name without quotes
|
||||
templateName := extendsExpr[1 : len(extendsExpr)-1]
|
||||
// Add as a string token
|
||||
tokenSlice.AppendToken(TOKEN_STRING, templateName, line)
|
||||
} else {
|
||||
// Not quoted, tokenize as a normal expression
|
||||
p.optimizedTokenizeExpression(extendsExpr, tokenSlice, line)
|
||||
}
|
||||
}
|
||||
} else if blockName == "set" {
|
||||
// Special handling for set tag to properly tokenize variable assignments
|
||||
if len(parts) > 1 {
|
||||
setExpr := strings.TrimSpace(parts[1])
|
||||
|
||||
// Check for the assignment operator
|
||||
assignPos := strings.Index(setExpr, "=")
|
||||
|
||||
if assignPos != -1 {
|
||||
// Split into variable name and value
|
||||
varName := strings.TrimSpace(setExpr[:assignPos])
|
||||
value := strings.TrimSpace(setExpr[assignPos+1:])
|
||||
|
||||
// Add the variable name token
|
||||
tokenSlice.AppendToken(TOKEN_NAME, varName, line)
|
||||
|
||||
// Add the assignment operator
|
||||
tokenSlice.AppendToken(TOKEN_OPERATOR, "=", line)
|
||||
|
||||
// Tokenize the value expression
|
||||
p.optimizedTokenizeExpression(value, tokenSlice, line)
|
||||
} else {
|
||||
// Handle case without assignment (e.g., {% set var %})
|
||||
tokenSlice.AppendToken(TOKEN_NAME, setExpr, line)
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// For other block types, just add parameters as NAME tokens
|
||||
if len(parts) > 1 {
|
||||
tokenSlice.AppendToken(TOKEN_NAME, parts[1], line)
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
// For variable tags, tokenize the expression
|
||||
if len(tagContent) > 0 {
|
||||
// If it's a simple variable name, add it directly
|
||||
if !strings.ContainsAny(tagContent, ".|[](){}\"',+-*/=!<>%&^~") {
|
||||
tokenSlice.AppendToken(TOKEN_NAME, tagContent, line)
|
||||
} else {
|
||||
// For complex expressions, tokenize properly
|
||||
p.optimizedTokenizeExpression(tagContent, tokenSlice, line)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Add the end tag token
|
||||
tokenSlice.AppendToken(endTagType, "", line)
|
||||
|
||||
// Move past the end tag
|
||||
currentPosition = currentPosition + endPos + endTagLength
|
||||
}
|
||||
|
||||
// Add EOF token
|
||||
tokenSlice.AppendToken(TOKEN_EOF, "", line)
|
||||
|
||||
// Finalize and return the token slice
|
||||
return tokenSlice.Finalize(), nil
|
||||
}

// optimizedTokenizeExpression handles tokenizing expressions inside Twig tags with reduced allocations
func (p *Parser) optimizedTokenizeExpression(expr string, tokens *PooledTokenSlice, line int) {
	var inString bool
	var stringDelimiter byte
	var stringStart int // Position where string content starts

	for i := 0; i < len(expr); i++ {
		c := expr[i]

		// Handle string literals with quotes
		if (c == '"' || c == '\'') && (i == 0 || expr[i-1] != '\\') {
			if inString && c == stringDelimiter {
				// End of string
				inString = false
				// Add the string token
				tokens.AppendToken(TOKEN_STRING, expr[stringStart:i], line)
			} else if !inString {
				// Start of string
				inString = true
				stringDelimiter = c
				// Remember the start position (for string content)
				stringStart = i + 1
			}
			// A quote inside a string with a different delimiter is skipped
			continue
		}

		// If we're inside a string, just skip this character
		if inString {
			continue
		}

		// Handle operators (including two-character operators)
		if isOperator(c) {
			// Check for two-character operators
			if i+1 < len(expr) {
				nextChar := expr[i+1]

				// Direct comparison for common two-char operators
				if (c == '=' && nextChar == '=') ||
					(c == '!' && nextChar == '=') ||
					(c == '>' && nextChar == '=') ||
					(c == '<' && nextChar == '=') ||
					(c == '&' && nextChar == '&') ||
					(c == '|' && nextChar == '|') ||
					(c == '?' && nextChar == '?') {

					// Add the two-character operator token
					tokens.AppendToken(TOKEN_OPERATOR, string([]byte{c, nextChar}), line)
					i++ // Skip the next character
					continue
				}
			}

			// Add single-character operator
			tokens.AppendToken(TOKEN_OPERATOR, string([]byte{c}), line)
			continue
		}

		// Handle punctuation
		if isPunctuation(c) {
			tokens.AppendToken(TOKEN_PUNCTUATION, string([]byte{c}), line)
			continue
		}

		// Handle whitespace - skip it
		if isWhitespace(c) {
			continue
		}

		// Handle identifiers and keywords
		if (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_' {
			// Start of an identifier
			start := i

			// Find the end of the identifier
			for i++; i < len(expr) && ((expr[i] >= 'a' && expr[i] <= 'z') ||
				(expr[i] >= 'A' && expr[i] <= 'Z') ||
				(expr[i] >= '0' && expr[i] <= '9') ||
				expr[i] == '_'); i++ {
			}

			// Extract the identifier
			identifier := expr[start:i]
			i-- // Adjust for the loop increment

			// Keywords (true/false/null) and plain identifiers are both emitted as NAME tokens
			tokens.AppendToken(TOKEN_NAME, identifier, line)

			continue
		}

		// Handle numbers
		if isDigit(c) || (c == '-' && i+1 < len(expr) && isDigit(expr[i+1])) {
			start := i

			// Skip the negative sign if present
			if c == '-' {
				i++
			}

			// Find the end of the number
			for i++; i < len(expr) && isDigit(expr[i]); i++ {
			}

			// Check for decimal point
			if i < len(expr) && expr[i] == '.' {
				i++
				// Find the end of the decimal part
				for ; i < len(expr) && isDigit(expr[i]); i++ {
				}
			}

			// Extract the number
			number := expr[start:i]
			i-- // Adjust for the loop increment

			// Add the number token
			tokens.AppendToken(TOKEN_NUMBER, number, line)
			continue
		}
	}
}

// optimizedTokenizeComplexObject parses and tokenizes a complex object with reduced allocations
func optimizedTokenizeComplexObject(objStr string, tokens *PooledTokenSlice, line int) {
	// First strip outer braces if present
	objStr = strings.TrimSpace(objStr)
	if strings.HasPrefix(objStr, "{") && strings.HasSuffix(objStr, "}") {
		objStr = strings.TrimSpace(objStr[1 : len(objStr)-1])
	}

	// Tokenize the object contents
	optimizedTokenizeObjectContents(objStr, tokens, line)
}

// optimizedTokenizeObjectContents parses key-value pairs with reduced allocations
func optimizedTokenizeObjectContents(content string, tokens *PooledTokenSlice, line int) {
	// State tracking
	inSingleQuote := false
	inDoubleQuote := false
	inObject := 0 // Nesting level for objects
	inArray := 0  // Nesting level for arrays

	start := 0
	colonPos := -1

	for i := 0; i <= len(content); i++ {
		// At the end of the string or at a comma at the top level
		atEnd := i == len(content)
		isComma := !atEnd && content[i] == ','

		if (isComma || atEnd) && inObject == 0 && inArray == 0 && !inSingleQuote && !inDoubleQuote {
			// We've found the end of a key-value pair
			if colonPos != -1 {
				// Extract the key and value
				keyStr := strings.TrimSpace(content[start:colonPos])
				valueStr := strings.TrimSpace(content[colonPos+1 : i])

				// Process the key
				if (len(keyStr) >= 2 && keyStr[0] == '\'' && keyStr[len(keyStr)-1] == '\'') ||
					(len(keyStr) >= 2 && keyStr[0] == '"' && keyStr[len(keyStr)-1] == '"') {
					// Quoted key - add as a string token
					tokens.AppendToken(TOKEN_STRING, keyStr[1:len(keyStr)-1], line)
				} else {
					// Unquoted key
					tokens.AppendToken(TOKEN_NAME, keyStr, line)
				}

				// Add colon separator
				tokens.AppendToken(TOKEN_PUNCTUATION, ":", line)

				// Process the value based on type
				if len(valueStr) >= 2 && valueStr[0] == '{' && valueStr[len(valueStr)-1] == '}' {
					// Nested object
					tokens.AppendToken(TOKEN_PUNCTUATION, "{", line)
					optimizedTokenizeObjectContents(valueStr[1:len(valueStr)-1], tokens, line)
					tokens.AppendToken(TOKEN_PUNCTUATION, "}", line)
				} else if len(valueStr) >= 2 && valueStr[0] == '[' && valueStr[len(valueStr)-1] == ']' {
					// Array
					tokens.AppendToken(TOKEN_PUNCTUATION, "[", line)
					optimizedTokenizeArrayElements(valueStr[1:len(valueStr)-1], tokens, line)
					tokens.AppendToken(TOKEN_PUNCTUATION, "]", line)
				} else if (len(valueStr) >= 2 && valueStr[0] == '\'' && valueStr[len(valueStr)-1] == '\'') ||
					(len(valueStr) >= 2 && valueStr[0] == '"' && valueStr[len(valueStr)-1] == '"') {
					// String literal
					tokens.AppendToken(TOKEN_STRING, valueStr[1:len(valueStr)-1], line)
				} else if isNumericValue(valueStr) {
					// Numeric value
					tokens.AppendToken(TOKEN_NUMBER, valueStr, line)
				} else if valueStr == "true" || valueStr == "false" {
					// Boolean literal
					tokens.AppendToken(TOKEN_NAME, valueStr, line)
				} else if valueStr == "null" || valueStr == "nil" {
					// Null/nil literal
					tokens.AppendToken(TOKEN_NAME, valueStr, line)
				} else {
					// Variable or other value
					tokens.AppendToken(TOKEN_NAME, valueStr, line)
				}

				// Add comma if needed
				if isComma && i < len(content)-1 {
					tokens.AppendToken(TOKEN_PUNCTUATION, ",", line)
				}

				// Reset state for next key-value pair
				start = i + 1
				colonPos = -1
			}
			continue
		}

		// Handle quotes and nested structures
		if i < len(content) {
			c := content[i]

			// Handle quote characters
			if c == '\'' && (i == 0 || content[i-1] != '\\') {
				inSingleQuote = !inSingleQuote
			} else if c == '"' && (i == 0 || content[i-1] != '\\') {
				inDoubleQuote = !inDoubleQuote
			}

			// Skip everything inside quotes
			if inSingleQuote || inDoubleQuote {
				continue
			}

			// Handle object and array nesting
			if c == '{' {
				inObject++
			} else if c == '}' {
				inObject--
			} else if c == '[' {
				inArray++
			} else if c == ']' {
				inArray--
			}

			// Find the colon separator if we're not in a nested structure
			if c == ':' && inObject == 0 && inArray == 0 && colonPos == -1 {
				colonPos = i
			}
		}
	}
}

// optimizedTokenizeArrayElements parses and tokenizes array elements with reduced allocations
func optimizedTokenizeArrayElements(arrStr string, tokens *PooledTokenSlice, line int) {
	// State tracking
	inSingleQuote := false
	inDoubleQuote := false
	inObject := 0
	inArray := 0

	// Track the start position of each element
	elemStart := 0

	for i := 0; i <= len(arrStr); i++ {
		// At the end of the string or at a comma at the top level
		atEnd := i == len(arrStr)
		isComma := !atEnd && arrStr[i] == ','

		// Process element when we reach a comma or the end
		if (isComma || atEnd) && inObject == 0 && inArray == 0 && !inSingleQuote && !inDoubleQuote {
			// Extract the element
			if i > elemStart {
				element := strings.TrimSpace(arrStr[elemStart:i])

				// Process the element based on its type
				if len(element) >= 2 {
					if element[0] == '{' && element[len(element)-1] == '}' {
						// Nested object
						tokens.AppendToken(TOKEN_PUNCTUATION, "{", line)
						optimizedTokenizeObjectContents(element[1:len(element)-1], tokens, line)
						tokens.AppendToken(TOKEN_PUNCTUATION, "}", line)
					} else if element[0] == '[' && element[len(element)-1] == ']' {
						// Nested array
						tokens.AppendToken(TOKEN_PUNCTUATION, "[", line)
						optimizedTokenizeArrayElements(element[1:len(element)-1], tokens, line)
						tokens.AppendToken(TOKEN_PUNCTUATION, "]", line)
					} else if (element[0] == '\'' && element[len(element)-1] == '\'') ||
						(element[0] == '"' && element[len(element)-1] == '"') {
						// String literal
						tokens.AppendToken(TOKEN_STRING, element[1:len(element)-1], line)
					} else if isNumericValue(element) {
						// Numeric value
						tokens.AppendToken(TOKEN_NUMBER, element, line)
					} else if element == "true" || element == "false" {
						// Boolean literal
						tokens.AppendToken(TOKEN_NAME, element, line)
					} else if element == "null" || element == "nil" {
						// Null/nil literal
						tokens.AppendToken(TOKEN_NAME, element, line)
					} else {
						// Variable or other value
						tokens.AppendToken(TOKEN_NAME, element, line)
					}
				}
			}

			// Add comma if needed
			if isComma && i < len(arrStr)-1 {
				tokens.AppendToken(TOKEN_PUNCTUATION, ",", line)
			}

			// Move to next element
			elemStart = i + 1
			continue
		}

		// Handle quotes and nested structures
		if !atEnd {
			c := arrStr[i]

			// Handle quote characters
			if c == '\'' && (i == 0 || arrStr[i-1] != '\\') {
				inSingleQuote = !inSingleQuote
			} else if c == '"' && (i == 0 || arrStr[i-1] != '\\') {
				inDoubleQuote = !inDoubleQuote
			}

			// Skip everything inside quotes
			if inSingleQuote || inDoubleQuote {
				continue
			}

			// Handle nesting
			if c == '{' {
				inObject++
			} else if c == '}' {
				inObject--
			} else if c == '[' {
				inArray++
			} else if c == ']' {
				inArray--
			}
		}
	}
}
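The two scanners above share one core technique: splitting on top-level commas while tracking quote and brace/bracket nesting state. A minimal, self-contained sketch of that splitting logic (the function name `splitTopLevel` is hypothetical, not part of the library):

```go
package main

import (
	"fmt"
	"strings"
)

// splitTopLevel splits s on commas that are not inside quotes, braces,
// or brackets - the same scanning idea used by
// optimizedTokenizeObjectContents and optimizedTokenizeArrayElements.
func splitTopLevel(s string) []string {
	var parts []string
	depth := 0
	inSingle, inDouble := false, false
	start := 0
	for i := 0; i <= len(s); i++ {
		if i == len(s) || (s[i] == ',' && depth == 0 && !inSingle && !inDouble) {
			parts = append(parts, strings.TrimSpace(s[start:i]))
			start = i + 1
			continue
		}
		c := s[i]
		switch {
		case c == '\'' && (i == 0 || s[i-1] != '\\') && !inDouble:
			inSingle = !inSingle
		case c == '"' && (i == 0 || s[i-1] != '\\') && !inSingle:
			inDouble = !inDouble
		case inSingle || inDouble:
			// Ignore structure characters inside string literals
		case c == '{' || c == '[':
			depth++
		case c == '}' || c == ']':
			depth--
		}
	}
	return parts
}

func main() {
	// Commas inside the quoted string, the array, and the nested
	// object do not split the input.
	fmt.Println(splitTopLevel(`name: 'a,b', tags: [1, 2], meta: {x: 1, y: 2}`))
}
```

Each part returned here corresponds to one key-value pair (or array element) that the real tokenizers then classify and emit as tokens.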
124	optimized_tokenizer.go	Normal file
@@ -0,0 +1,124 @@
package twig

import (
	"strings"
	"sync"
)

// OptimizedTokenizer implements a tokenizer that uses the global string cache
// for zero-allocation string interning
type OptimizedTokenizer struct {
	// Use the underlying tokenizer methods but intern strings
	baseTokenizer ZeroAllocTokenizer
	// Local cache of whether a string is a tag name
	tagCache map[string]bool
}

// optimizedTokenizerPool is a sync.Pool for OptimizedTokenizer objects
var optimizedTokenizerPool = sync.Pool{
	New: func() interface{} {
		return &OptimizedTokenizer{
			tagCache: make(map[string]bool, 32), // Pre-allocate with reasonable capacity
		}
	},
}

// NewOptimizedTokenizer gets an OptimizedTokenizer from the pool
func NewOptimizedTokenizer() *OptimizedTokenizer {
	return optimizedTokenizerPool.Get().(*OptimizedTokenizer)
}

// ReleaseOptimizedTokenizer returns an OptimizedTokenizer to the pool
func ReleaseOptimizedTokenizer(t *OptimizedTokenizer) {
	// Clear map but preserve capacity
	for k := range t.tagCache {
		delete(t.tagCache, k)
	}

	// Return to pool
	optimizedTokenizerPool.Put(t)
}

// TokenizeHtmlPreserving tokenizes HTML, preserving its structure
func (t *OptimizedTokenizer) TokenizeHtmlPreserving() ([]Token, error) {
	// Use the base tokenizer for complex operations
	tokens, err := t.baseTokenizer.TokenizeHtmlPreserving()
	if err != nil {
		return nil, err
	}

	// Optimize token strings by interning
	for i := range tokens {
		// Intern the value field of each token
		if tokens[i].Value != "" {
			tokens[i].Value = Intern(tokens[i].Value)
		}

		// For tag names, intern them as well
		if tokens[i].Type == TOKEN_BLOCK_START || tokens[i].Type == TOKEN_BLOCK_START_TRIM ||
			tokens[i].Type == TOKEN_VAR_START || tokens[i].Type == TOKEN_VAR_START_TRIM {
			// Skip processing as these tokens don't have values
			continue
		}

		// Process tag names - these will be TOKEN_NAME after a block start token
		if i > 0 && tokens[i].Type == TOKEN_NAME &&
			(tokens[i-1].Type == TOKEN_BLOCK_START || tokens[i-1].Type == TOKEN_BLOCK_START_TRIM) {
			// Intern the tag name
			tokens[i].Value = Intern(tokens[i].Value)

			// Cache whether this is a tag
			t.tagCache[tokens[i].Value] = true
		}
	}

	return tokens, nil
}

// TokenizeExpression tokenizes a Twig expression
func (t *OptimizedTokenizer) TokenizeExpression(expression string) []Token {
	// Use the base tokenizer for complex operations
	tokens := t.baseTokenizer.TokenizeExpression(expression)

	// Optimize token strings by interning
	for i := range tokens {
		if tokens[i].Value != "" {
			tokens[i].Value = Intern(tokens[i].Value)
		}
	}

	return tokens
}

// ApplyWhitespaceControl applies whitespace control for trimming tokens
func (t *OptimizedTokenizer) ApplyWhitespaceControl() {
	t.baseTokenizer.ApplyWhitespaceControl()
}

// Helper to extract tag name from a token value
func extractTagName(value string) string {
	value = strings.TrimSpace(value)
	space := strings.IndexByte(value, ' ')
	if space >= 0 {
		return value[:space]
	}
	return value
}

// IsTag checks if a string is a known tag name (cached)
func (t *OptimizedTokenizer) IsTag(name string) bool {
	// Fast path for common tags
	switch name {
	case stringIf, stringFor, stringEnd, stringEndif, stringEndfor,
		stringElse, stringBlock, stringSet, stringInclude, stringExtends:
		return true
	}

	// Check the local cache
	if isTag, exists := t.tagCache[name]; exists {
		return isTag
	}

	// Fall back to the base tokenizer's logic
	return false
}
21	parser.go
@@ -59,21 +59,28 @@ func (p *Parser) Parse(source string) (Node, error) {
 	// Initialize default block handlers
 	p.initBlockHandlers()
 
-	// Use the zero-allocation tokenizer for maximum performance and minimal allocations
+	// Use the optimized tokenizer for maximum performance and minimal allocations
 	// This will treat everything outside twig tags as TEXT tokens
 	var err error
 
-	// Use the zero-allocation tokenizer to achieve minimal memory usage and high performance
-	tokenizer := GetTokenizer(p.source, 0)
-	p.tokens, err = tokenizer.TokenizeHtmlPreserving()
+	// Use optimized tokenizer with global string cache for better performance
+	optimizedTokenizer := NewOptimizedTokenizer()
+
+	// Set the source for the base tokenizer
+	optimizedTokenizer.baseTokenizer.source = p.source
+	optimizedTokenizer.baseTokenizer.position = 0
+	optimizedTokenizer.baseTokenizer.line = 1
+
+	// Tokenize using the optimized tokenizer
+	p.tokens, err = optimizedTokenizer.TokenizeHtmlPreserving()
 
 	// Apply whitespace control to handle whitespace trimming directives
 	if err == nil {
-		tokenizer.ApplyWhitespaceControl()
+		optimizedTokenizer.ApplyWhitespaceControl()
 	}
 
-	// Release the tokenizer back to the pool
-	ReleaseTokenizer(tokenizer)
+	// Return the tokenizer to the pool
+	ReleaseOptimizedTokenizer(optimizedTokenizer)
 
 	if err != nil {
 		return nil, fmt.Errorf("tokenization error: %w", err)
@@ -1,48 +0,0 @@
package twig

import (
	"io/ioutil"
	"testing"
)

func BenchmarkWriteStringDirect(b *testing.B) {
	buf := NewStringBuffer()
	defer buf.Release()
	longStr := "This is a test string for benchmarking the write performance of direct byte slice conversion"

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		buf.buf.Reset()
		buf.buf.Write([]byte(longStr))
	}
}

func BenchmarkWriteStringOptimized(b *testing.B) {
	buf := NewStringBuffer()
	defer buf.Release()
	longStr := "This is a test string for benchmarking the write performance of optimized string writing"

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		buf.buf.Reset()
		WriteString(&buf.buf, longStr)
	}
}

func BenchmarkWriteStringDirect_Discard(b *testing.B) {
	longStr := "This is a test string for benchmarking the write performance of direct byte slice conversion"

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		ioutil.Discard.Write([]byte(longStr))
	}
}

func BenchmarkWriteStringOptimized_Discard(b *testing.B) {
	longStr := "This is a test string for benchmarking the write performance of optimized string writing"

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		WriteString(ioutil.Discard, longStr)
	}
}
@@ -1,521 +0,0 @@
package twig

import (
	"fmt"
	"strings"
	"sync"
)

// ImprovedTokenSlice is a more efficient implementation of a token slice pool
// that truly minimizes allocations during tokenization
type ImprovedTokenSlice struct {
	tokens   []Token // The actual token slice
	capacity int     // Capacity hint for the token slice
	used     bool    // Whether this slice has been used
}

// global pool for ImprovedTokenSlice objects
var improvedTokenSlicePool = sync.Pool{
	New: func() interface{} {
		// Start with a reasonably sized token slice
		tokens := make([]Token, 0, 64)
		return &ImprovedTokenSlice{
			tokens:   tokens,
			capacity: 64,
			used:     false,
		}
	},
}

// Global token object pool
var tokenObjectPool = sync.Pool{
	New: func() interface{} {
		return &Token{}
	},
}

// GetImprovedTokenSlice gets a token slice from the pool
func GetImprovedTokenSlice(capacityHint int) *ImprovedTokenSlice {
	slice := improvedTokenSlicePool.Get().(*ImprovedTokenSlice)

	// Reset the slice but keep capacity
	if cap(slice.tokens) < capacityHint {
		// Need to allocate a larger slice
		slice.tokens = make([]Token, 0, capacityHint)
		slice.capacity = capacityHint
	} else {
		// Reuse existing slice
		slice.tokens = slice.tokens[:0]
	}

	slice.used = false
	return slice
}

// AppendToken adds a token to the slice
func (s *ImprovedTokenSlice) AppendToken(tokenType int, value string, line int) {
	if s.used {
		return // Already finalized
	}

	// Create a token and add it to the slice
	token := Token{
		Type:  tokenType,
		Value: value,
		Line:  line,
	}

	s.tokens = append(s.tokens, token)
}

// Finalize returns the token slice
func (s *ImprovedTokenSlice) Finalize() []Token {
	if s.used {
		return s.tokens
	}

	s.used = true
	return s.tokens
}

// Release returns the token slice to the pool
func (s *ImprovedTokenSlice) Release() {
	if s.used && cap(s.tokens) <= 1024 { // Don't pool very large slices
		// Only return reasonably sized slices to the pool
		improvedTokenSlicePool.Put(s)
	}
}

// optimizedTokenizeExpressionImproved is a minimal allocation version of tokenizeExpression
func (p *Parser) optimizedTokenizeExpressionImproved(expr string, tokens *ImprovedTokenSlice, line int) {
	var inString bool
	var stringDelimiter byte
	var stringStart int

	// Preallocate a buffer for building tokens
	buffer := make([]byte, 0, 64)

	for i := 0; i < len(expr); i++ {
		c := expr[i]

		// Handle string literals
		if (c == '"' || c == '\'') && (i == 0 || expr[i-1] != '\\') {
			if inString && c == stringDelimiter {
				// End of string, add the string token
				tokens.AppendToken(TOKEN_STRING, expr[stringStart:i], line)
				inString = false
			} else if !inString {
				// Start of string
				inString = true
				stringDelimiter = c
				stringStart = i + 1
			}
			continue
		}

		// Skip chars inside strings
		if inString {
			continue
		}

		// Handle operators
		if isCharOperator(c) {
			// Check for two-character operators
			if i+1 < len(expr) {
				nextChar := expr[i+1]

				if (c == '=' && nextChar == '=') ||
					(c == '!' && nextChar == '=') ||
					(c == '>' && nextChar == '=') ||
					(c == '<' && nextChar == '=') ||
					(c == '&' && nextChar == '&') ||
					(c == '|' && nextChar == '|') ||
					(c == '?' && nextChar == '?') {

					// Two-char operator
					buffer = buffer[:0]
					buffer = append(buffer, c, nextChar)
					tokens.AppendToken(TOKEN_OPERATOR, string(buffer), line)
					i++
					continue
				}
			}

			// Single-char operator
			tokens.AppendToken(TOKEN_OPERATOR, string([]byte{c}), line)
			continue
		}

		// Handle punctuation
		if isCharPunctuation(c) {
			tokens.AppendToken(TOKEN_PUNCTUATION, string([]byte{c}), line)
			continue
		}

		// Skip whitespace
		if isCharWhitespace(c) {
			continue
		}

		// Handle identifiers, literals, etc.
		if isCharAlpha(c) || c == '_' {
			// Start of an identifier
			start := i

			// Find the end
			for i++; i < len(expr) && (isCharAlpha(expr[i]) || isCharDigit(expr[i]) || expr[i] == '_'); i++ {
			}

			// Extract the identifier
			identifier := expr[start:i]
			i-- // Adjust for loop increment

			// Add token
			tokens.AppendToken(TOKEN_NAME, identifier, line)
			continue
		}

		// Handle numbers
		if isCharDigit(c) || (c == '-' && i+1 < len(expr) && isCharDigit(expr[i+1])) {
			start := i

			// Skip negative sign if present
			if c == '-' {
				i++
			}

			// Find end of number
			for i++; i < len(expr) && isCharDigit(expr[i]); i++ {
			}

			// Check for decimal point
			if i < len(expr) && expr[i] == '.' {
				i++
				for ; i < len(expr) && isCharDigit(expr[i]); i++ {
				}
			}

			// Extract the number
			number := expr[start:i]
			i-- // Adjust for loop increment

			tokens.AppendToken(TOKEN_NUMBER, number, line)
			continue
		}
	}
}

// Helper functions to reduce allocations for character checks - inlined to avoid naming conflicts

// isCharAlpha checks if a character is alphabetic
func isCharAlpha(c byte) bool {
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// isCharDigit checks if a character is a digit
func isCharDigit(c byte) bool {
	return c >= '0' && c <= '9'
}

// isCharOperator checks if a character is an operator
func isCharOperator(c byte) bool {
	return c == '=' || c == '+' || c == '-' || c == '*' || c == '/' ||
		c == '%' || c == '&' || c == '|' || c == '^' || c == '~' ||
		c == '<' || c == '>' || c == '!' || c == '?'
}

// isCharPunctuation checks if a character is punctuation
func isCharPunctuation(c byte) bool {
	return c == '(' || c == ')' || c == '[' || c == ']' || c == '{' || c == '}' ||
		c == '.' || c == ',' || c == ':' || c == ';'
}

// isCharWhitespace checks if a character is whitespace
func isCharWhitespace(c byte) bool {
	return c == ' ' || c == '\t' || c == '\n' || c == '\r'
}

// improvedHtmlPreservingTokenize is a zero-allocation version of the HTML preserving tokenizer
func (p *Parser) improvedHtmlPreservingTokenize() ([]Token, error) {
	// Estimate token count based on source length
	estimatedTokens := len(p.source) / 20 // Rough estimate
	tokens := GetImprovedTokenSlice(estimatedTokens)
	defer tokens.Release()

	var currentPosition int
	line := 1

	// Reusable buffers to avoid allocations
	tagPatterns := [5]string{"{{-", "{{", "{%-", "{%", "{#"}
	tagTypes := [5]int{TOKEN_VAR_START_TRIM, TOKEN_VAR_START, TOKEN_BLOCK_START_TRIM, TOKEN_BLOCK_START, TOKEN_COMMENT_START}
	tagLengths := [5]int{3, 2, 3, 2, 2}

	for currentPosition < len(p.source) {
		// Find the next tag
		nextTagPos := -1
		tagType := -1
		tagLength := 0

		// Check for all possible tag patterns
		for i := 0; i < 5; i++ {
			pos := strings.Index(p.source[currentPosition:], tagPatterns[i])
			if pos != -1 {
				// Adjust position relative to current position
				pos += currentPosition

				// If this is the first tag found or it's closer than previous ones
				if nextTagPos == -1 || pos < nextTagPos {
					nextTagPos = pos
					tagType = tagTypes[i]
					tagLength = tagLengths[i]
				}
			}
		}

		// Check if the tag is escaped
		if nextTagPos != -1 && nextTagPos > 0 && p.source[nextTagPos-1] == '\\' {
			// Add text up to the backslash
			if nextTagPos-1 > currentPosition {
				preText := p.source[currentPosition : nextTagPos-1]
				tokens.AppendToken(TOKEN_TEXT, preText, line)
				line += countNewlines(preText)
			}

			// Add the tag as literal text (without the backslash)
			// Find which pattern was matched
			for i := 0; i < 5; i++ {
				if tagType == tagTypes[i] {
					tokens.AppendToken(TOKEN_TEXT, tagPatterns[i], line)
					break
				}
			}

			// Move past this tag
			currentPosition = nextTagPos + tagLength
			continue
		}

		// No more tags found - add the rest as TEXT
		if nextTagPos == -1 {
			remainingText := p.source[currentPosition:]
			if len(remainingText) > 0 {
				tokens.AppendToken(TOKEN_TEXT, remainingText, line)
				line += countNewlines(remainingText)
			}
			break
		}

		// Add text before the tag
		if nextTagPos > currentPosition {
			textContent := p.source[currentPosition:nextTagPos]
			tokens.AppendToken(TOKEN_TEXT, textContent, line)
			line += countNewlines(textContent)
		}

		// Add the tag start token
		tokens.AppendToken(tagType, "", line)

		// Move past opening tag
		currentPosition = nextTagPos + tagLength

		// Find matching end tag
		var endTag string
		var endTagType int
		var endTagLength int

		if tagType == TOKEN_VAR_START || tagType == TOKEN_VAR_START_TRIM {
|
||||
// Look for "}}" or "-}}"
|
||||
endPos1 := strings.Index(p.source[currentPosition:], "}}")
|
||||
endPos2 := strings.Index(p.source[currentPosition:], "-}}")
|
||||
|
||||
if endPos1 != -1 && (endPos2 == -1 || endPos1 < endPos2) {
|
||||
endTag = "}}"
|
||||
endTagType = TOKEN_VAR_END
|
||||
endTagLength = 2
|
||||
} else if endPos2 != -1 {
|
||||
endTag = "-}}"
|
||||
endTagType = TOKEN_VAR_END_TRIM
|
||||
endTagLength = 3
|
||||
} else {
|
||||
return nil, fmt.Errorf("unclosed variable tag at line %d", line)
|
||||
}
|
||||
} else if tagType == TOKEN_BLOCK_START || tagType == TOKEN_BLOCK_START_TRIM {
|
||||
// Look for "%}" or "-%}"
|
||||
endPos1 := strings.Index(p.source[currentPosition:], "%}")
|
||||
endPos2 := strings.Index(p.source[currentPosition:], "-%}")
|
||||
|
||||
if endPos1 != -1 && (endPos2 == -1 || endPos1 < endPos2) {
|
||||
endTag = "%}"
|
||||
endTagType = TOKEN_BLOCK_END
|
||||
endTagLength = 2
|
||||
} else if endPos2 != -1 {
|
||||
endTag = "-%}"
|
||||
endTagType = TOKEN_BLOCK_END_TRIM
|
||||
endTagLength = 3
|
||||
} else {
|
||||
return nil, fmt.Errorf("unclosed block tag at line %d", line)
|
||||
}
|
||||
} else if tagType == TOKEN_COMMENT_START {
|
||||
// Look for "#}"
|
||||
endPos := strings.Index(p.source[currentPosition:], "#}")
|
||||
if endPos == -1 {
|
||||
return nil, fmt.Errorf("unclosed comment at line %d", line)
|
||||
}
|
||||
endTag = "#}"
|
||||
endTagType = TOKEN_COMMENT_END
|
||||
endTagLength = 2
|
||||
}
|
||||
|
||||
// Find position of the end tag
|
||||
endPos := strings.Index(p.source[currentPosition:], endTag)
|
||||
if endPos == -1 {
|
||||
return nil, fmt.Errorf("unclosed tag at line %d", line)
|
||||
}
|
||||
|
||||
// Get content between tags
|
||||
tagContent := p.source[currentPosition:currentPosition+endPos]
|
||||
line += countNewlines(tagContent)
|
||||
|
||||
// Process tag content based on type
|
||||
if tagType == TOKEN_COMMENT_START {
|
||||
// Store comments as TEXT tokens
|
||||
if len(tagContent) > 0 {
|
||||
tokens.AppendToken(TOKEN_TEXT, tagContent, line)
|
||||
}
|
||||
} else {
|
||||
// For variable and block tags, tokenize the content
|
||||
tagContent = strings.TrimSpace(tagContent)
|
||||
|
||||
if tagType == TOKEN_BLOCK_START || tagType == TOKEN_BLOCK_START_TRIM {
|
||||
// Process block tags with optimized tokenization
|
||||
processBlockTag(tagContent, tokens, line, p)
|
||||
} else {
|
||||
// Process variable tags with optimized tokenization
|
||||
if len(tagContent) > 0 {
|
||||
if !strings.ContainsAny(tagContent, ".|[](){}\"',+-*/=!<>%&^~") {
|
||||
// Simple variable name
|
||||
tokens.AppendToken(TOKEN_NAME, tagContent, line)
|
||||
} else {
|
||||
// Complex expression
|
||||
expressionTokens := GetImprovedTokenSlice(len(tagContent) / 4)
|
||||
p.optimizedTokenizeExpressionImproved(tagContent, expressionTokens, line)
|
||||
|
||||
// Copy tokens
|
||||
for _, token := range expressionTokens.tokens {
|
||||
tokens.AppendToken(token.Type, token.Value, token.Line)
|
||||
}
|
||||
|
||||
expressionTokens.Release()
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Add the end tag token
|
||||
tokens.AppendToken(endTagType, "", line)
|
||||
|
||||
// Move past the end tag
|
||||
currentPosition = currentPosition + endPos + endTagLength
|
||||
}
|
||||
|
||||
// Add EOF token
|
||||
tokens.AppendToken(TOKEN_EOF, "", line)
|
||||
|
||||
return tokens.Finalize(), nil
|
||||
}
|
||||
|
||||
// Helper function to process block tags
|
||||
func processBlockTag(content string, tokens *ImprovedTokenSlice, line int, p *Parser) {
|
||||
// Extract the tag name
|
||||
parts := strings.SplitN(content, " ", 2)
|
||||
if len(parts) > 0 {
|
||||
blockName := parts[0]
|
||||
tokens.AppendToken(TOKEN_NAME, blockName, line)
|
||||
|
||||
// Process rest of the block content
|
||||
if len(parts) > 1 {
|
||||
blockContent := strings.TrimSpace(parts[1])
|
||||
|
||||
switch blockName {
|
||||
case "if", "elseif":
|
||||
// For conditional blocks, tokenize expression
|
||||
exprTokens := GetImprovedTokenSlice(len(blockContent) / 4)
|
||||
p.optimizedTokenizeExpressionImproved(blockContent, exprTokens, line)
|
||||
|
||||
// Copy tokens
|
||||
for _, token := range exprTokens.tokens {
|
||||
tokens.AppendToken(token.Type, token.Value, token.Line)
|
||||
}
|
||||
|
||||
exprTokens.Release()
|
||||
|
||||
case "for":
|
||||
// Process for loop with iterator(s) and collection
|
||||
inPos := strings.Index(strings.ToLower(blockContent), " in ")
|
||||
if inPos != -1 {
|
||||
iterators := strings.TrimSpace(blockContent[:inPos])
|
||||
collection := strings.TrimSpace(blockContent[inPos+4:])
|
||||
|
||||
// Handle key, value iterator syntax
|
||||
if strings.Contains(iterators, ",") {
|
||||
iterParts := strings.SplitN(iterators, ",", 2)
|
||||
if len(iterParts) == 2 {
|
||||
tokens.AppendToken(TOKEN_NAME, strings.TrimSpace(iterParts[0]), line)
|
||||
tokens.AppendToken(TOKEN_PUNCTUATION, ",", line)
|
||||
tokens.AppendToken(TOKEN_NAME, strings.TrimSpace(iterParts[1]), line)
|
||||
}
|
||||
} else {
|
||||
// Single iterator
|
||||
tokens.AppendToken(TOKEN_NAME, iterators, line)
|
||||
}
|
||||
|
||||
// Add 'in' keyword
|
||||
tokens.AppendToken(TOKEN_NAME, "in", line)
|
||||
|
||||
// Process collection expression
|
||||
collectionTokens := GetImprovedTokenSlice(len(collection) / 4)
|
||||
p.optimizedTokenizeExpressionImproved(collection, collectionTokens, line)
|
||||
|
||||
// Copy tokens
|
||||
for _, token := range collectionTokens.tokens {
|
||||
tokens.AppendToken(token.Type, token.Value, token.Line)
|
||||
}
|
||||
|
||||
collectionTokens.Release()
|
||||
} else {
|
||||
// Fallback for malformed for loops
|
||||
tokens.AppendToken(TOKEN_NAME, blockContent, line)
|
||||
}
|
||||
|
||||
case "set":
|
||||
// Handle variable assignment
|
||||
assignPos := strings.Index(blockContent, "=")
|
||||
if assignPos != -1 {
|
||||
varName := strings.TrimSpace(blockContent[:assignPos])
|
||||
value := strings.TrimSpace(blockContent[assignPos+1:])
|
||||
|
||||
tokens.AppendToken(TOKEN_NAME, varName, line)
|
||||
tokens.AppendToken(TOKEN_OPERATOR, "=", line)
|
||||
|
||||
// Tokenize value expression
|
||||
valueTokens := GetImprovedTokenSlice(len(value) / 4)
|
||||
p.optimizedTokenizeExpressionImproved(value, valueTokens, line)
|
||||
|
||||
// Copy tokens
|
||||
for _, token := range valueTokens.tokens {
|
||||
tokens.AppendToken(token.Type, token.Value, token.Line)
|
||||
}
|
||||
|
||||
valueTokens.Release()
|
||||
} else {
|
||||
// Simple set without assignment
|
||||
tokens.AppendToken(TOKEN_NAME, blockContent, line)
|
||||
}
|
||||
|
||||
default:
|
||||
// Other block types
|
||||
tokens.AppendToken(TOKEN_NAME, blockContent, line)
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@@ -1,165 +0,0 @@
package twig

import (
	"sync"
)

// This file implements optimized token handling functions that reduce
// allocations during the tokenization process.

// PooledToken represents a token from the token pool.
// We use a separate struct to avoid accidentally returning the same instance.
type PooledToken struct {
	token *Token // Reference to the token from the pool
}

// PooledTokenSlice is a slice of tokens with a reference to the original
// pooled slice.
type PooledTokenSlice struct {
	tokens  []Token   // The token slice
	poolRef *[]Token  // Reference to the original slice from the pool
	used    bool      // Whether this slice has been finalized
	tmpPool sync.Pool // Pool for temporary token objects
	scratch []*Token  // Scratch space for temporary tokens
}

// GetPooledTokenSlice gets a token slice from the pool with the given
// capacity hint.
func GetPooledTokenSlice(capacityHint int) *PooledTokenSlice {
	slice := &PooledTokenSlice{
		tmpPool: sync.Pool{
			New: func() interface{} {
				return &Token{}
			},
		},
		scratch: make([]*Token, 0, 16), // Pre-allocate scratch space
		used:    false,
	}

	// Get a token slice from the pool
	pooledSlice := GetTokenSlice(capacityHint)
	slice.tokens = pooledSlice
	slice.poolRef = &pooledSlice

	return slice
}

// AppendToken adds a token to the slice using pooled tokens.
func (s *PooledTokenSlice) AppendToken(tokenType int, value string, line int) {
	if s.used {
		// This slice has already been finalized; further appends are no-ops
		return
	}

	// Get a token from the pool
	token := s.tmpPool.Get().(*Token)
	token.Type = tokenType
	token.Value = value
	token.Line = line

	// Keep a reference to this token so it can be cleaned up later
	s.scratch = append(s.scratch, token)

	// Add a copy of the token to the slice
	s.tokens = append(s.tokens, *token)
}

// Finalize returns the token slice and cleans up temporary tokens.
func (s *PooledTokenSlice) Finalize() []Token {
	if s.used {
		// Already finalized
		return s.tokens
	}

	// Mark as used so the slice is not accidentally reused
	s.used = true

	// Clean up temporary tokens
	for _, token := range s.scratch {
		token.Value = ""
		s.tmpPool.Put(token)
	}

	// Clear the scratch slice but keep its capacity
	s.scratch = s.scratch[:0]

	return s.tokens
}

// Release returns the token slice to the pool.
func (s *PooledTokenSlice) Release() {
	if s.poolRef != nil {
		ReleaseTokenSlice(*s.poolRef)
		s.poolRef = nil
	}

	// Clean up any remaining temporary tokens
	for _, token := range s.scratch {
		token.Value = ""
		s.tmpPool.Put(token)
	}

	// Clear references
	s.scratch = nil
	s.tokens = nil
	s.used = true
}

// getPooledToken gets a token from the pool (for internal use).
func getPooledToken() *Token {
	return TokenPool.Get().(*Token)
}

// releasePooledToken returns a token to the pool (for internal use).
func releasePooledToken(token *Token) {
	if token == nil {
		return
	}
	token.Value = ""
	TokenPool.Put(token)
}

// TOKEN SLICES - additional optimization for token slice reuse

// TokenNodePool provides a pool for pre-sized token node arrays.
var TokenNodePool = sync.Pool{
	New: func() interface{} {
		// Default capacity that covers most cases
		slice := make([]Node, 0, 32)
		return &slice
	},
}

// GetTokenNodeSlice gets a slice of Node from the pool.
func GetTokenNodeSlice(capacityHint int) *[]Node {
	slice := TokenNodePool.Get().(*[]Node)

	// If the capacity is too small, allocate a new slice
	if cap(*slice) < capacityHint {
		*slice = make([]Node, 0, capacityHint)
	} else {
		// Otherwise, clear the slice but keep its capacity
		*slice = (*slice)[:0]
	}

	return slice
}

// ReleaseTokenNodeSlice returns a slice of Node to the pool.
func ReleaseTokenNodeSlice(slice *[]Node) {
	if slice == nil {
		return
	}

	// Only pool reasonably sized slices
	if cap(*slice) > 1000 || cap(*slice) < 32 {
		return
	}

	// Clear references to help the GC
	for i := range *slice {
		(*slice)[i] = nil
	}

	// Clear the slice but keep its capacity
	*slice = (*slice)[:0]
	TokenNodePool.Put(slice)
}
@@ -1,333 +0,0 @@
package twig

import (
	"testing"
)

// benchmarkTemplate is a sample template with HTML and Twig tags, shared by
// the HTML-preserving tokenizer benchmarks below.
const benchmarkTemplate = `<!DOCTYPE html>
<html>
<head>
    <title>{{ title }}</title>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" href="{{ asset_url('styles.css') }}">
</head>
<body>
    <header>
        <h1>{{ page.title }}</h1>
        <nav>
            <ul>
                {% for item in menu %}
                    <li><a href="{{ item.url }}">{{ item.label }}</a></li>
                {% endfor %}
            </ul>
        </nav>
    </header>

    <main>
        {% if content %}
            <article>
                {{ content|raw }}
            </article>
        {% else %}
            <p>No content available.</p>
        {% endif %}

        {% block sidebar %}
            <aside>
                {% include "sidebar.twig" with {items: sidebar_items} %}
            </aside>
        {% endblock %}
    </main>

    <footer>
        <p>© {{ "now"|date("Y") }} {{ site_name }}</p>
    </footer>
</body>
</html>`

func BenchmarkHtmlPreservingTokenize(b *testing.B) {
	parser := &Parser{source: benchmarkTemplate}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = parser.htmlPreservingTokenize()
	}
}

func BenchmarkOptimizedHtmlPreservingTokenize(b *testing.B) {
	parser := &Parser{source: benchmarkTemplate}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = parser.optimizedHtmlPreservingTokenize()
	}
}

func BenchmarkImprovedHtmlPreservingTokenize(b *testing.B) {
	parser := &Parser{source: benchmarkTemplate}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = parser.improvedHtmlPreservingTokenize()
	}
}

func BenchmarkZeroAllocHtmlTokenize(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokenizer := GetTokenizer(benchmarkTemplate, 0)
		_, _ = tokenizer.TokenizeHtmlPreserving()
		ReleaseTokenizer(tokenizer)
	}
}

func BenchmarkTokenizeExpression(b *testing.B) {
	source := `user.name ~ " is " ~ user.age ~ " years old and lives in " ~ user.address.city`
	parser := &Parser{source: source}
	tokens := make([]Token, 0, 30)

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokens = tokens[:0]
		parser.tokenizeExpression(source, &tokens, 1)
	}
}

func BenchmarkOptimizedTokenizeExpression(b *testing.B) {
	source := `user.name ~ " is " ~ user.age ~ " years old and lives in " ~ user.address.city`
	parser := &Parser{source: source}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokenSlice := GetPooledTokenSlice(30)
		parser.optimizedTokenizeExpression(source, tokenSlice, 1)
		tokenSlice.Release()
	}
}

func BenchmarkImprovedTokenizeExpression(b *testing.B) {
	source := `user.name ~ " is " ~ user.age ~ " years old and lives in " ~ user.address.city`
	parser := &Parser{source: source}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokenSlice := GetImprovedTokenSlice(30)
		parser.optimizedTokenizeExpressionImproved(source, tokenSlice, 1)
		tokenSlice.Release()
	}
}

func BenchmarkZeroAllocTokenize(b *testing.B) {
	source := `user.name ~ " is " ~ user.age ~ " years old and lives in " ~ user.address.city`

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokenizer := GetTokenizer(source, 30)
		tokenizer.TokenizeExpression(source)
		ReleaseTokenizer(tokenizer)
	}
}

func BenchmarkComplexTokenize(b *testing.B) {
	// A more complex example with nested structures
	source := `{% for user in users %}
  {% if user.active %}
    <div class="user {{ user.role }}">
      <h2>{{ user.name|title }}</h2>
      <p>{{ user.bio|striptags|truncate(100) }}</p>

      {% if user.permissions is defined and 'admin' in user.permissions %}
        <span class="admin-badge">Admin</span>
      {% endif %}

      <ul class="contact-info">
        {% for method, value in user.contacts %}
          <li class="{{ method }}">{{ value }}</li>
        {% endfor %}
      </ul>

      {% set stats = user.getStatistics() %}
      <div class="stats">
        <span>Posts: {{ stats.posts }}</span>
        <span>Comments: {{ stats.comments }}</span>
        <span>Last active: {{ stats.lastActive|date("d M Y") }}</span>
      </div>
    </div>
  {% else %}
    <!-- User {{ user.name }} is inactive -->
  {% endif %}
{% endfor %}`

	parser := &Parser{source: source}

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_, _ = parser.optimizedHtmlPreservingTokenize()
	}
}

func BenchmarkTokenizeComplexObject(b *testing.B) {
	// A complex object literal with nested structures
	source := `{
  name: "John Doe",
  age: 30,
  address: {
    street: "123 Main St",
    city: "New York",
    country: "USA"
  },
  preferences: {
    theme: "dark",
    notifications: true,
    privacy: {
      showEmail: false,
      showPhone: true
    }
  },
  contacts: ["john@example.com", "+1234567890"],
  scores: [95, 87, 92, 78],
  metadata: {
    created: "2023-01-15",
    modified: "2023-06-22",
    tags: ["user", "premium", "verified"]
  }
}`

	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		tokenSlice := GetPooledTokenSlice(100)
		optimizedTokenizeComplexObject(source, tokenSlice, 1)
		tokenSlice.Release()
	}
}