vessel/doc/dynamic-path.md
2025-06-24 13:17:46 +03:00

# Dynamic Route Path Format Specification for Vessel
This specification defines the dynamic route template system for Vessel. This system enables developers to create routes with dynamic segments that can accept various data types, constraints, and optional values.
## Static Segments
Static segments refer to parts of a URL that do not change dynamically. By default, a static segment matches exactly as written. For example:
```
/hello/world
```
This pattern matches `/hello/world`.
Static segments can contain optional characters, and an entire static section can be optional. To mark a single character as optional, place `?` after it:
```
/h?ello/world
```
This matches both `/hello/world` and `/ello/world`.
Similarly, an entire static section can be made optional by prefixing it with `?`. For example:
```
?/hello/world/<int>
```
This matches `/hello/world/1234` as well as just `1234`.
You may also combine both character-optional and section-optional flags:
```
?/hel?lo/world/<int>
```
This matches all of `/hello/world/1234`, `/helo/world/1234`, and `1234`.
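The optional-character and optional-section rules above can be modelled by expanding a static section into every literal string it accepts. The following is a minimal Python sketch, not Vessel's C implementation; `expand_static` is a hypothetical name, and the sketch assumes that `?` applies to the single character before it and ignores backslash escapes.

```python
from itertools import product


def expand_static(section: str) -> set:
    """Return every literal string a static section can match.

    Assumptions (not confirmed by the specification): a leading `?` makes
    the whole section optional, any other `?` makes the preceding character
    optional, and backslash escapes are not handled here.
    """
    whole_optional = section.startswith("?")
    if whole_optional:
        section = section[1:]

    # Each position holds its alternatives: [ch] or [ch, ""] when optional.
    choices = []
    for ch in section:
        if ch == "?" and choices:
            choices[-1] = [choices[-1][0], ""]
        else:
            choices.append([ch])

    variants = {"".join(parts) for parts in product(*choices)}
    if whole_optional:
        variants.add("")  # the section may be absent entirely
    return variants
```

For the static part of the last example, `expand_static("?/hel?lo/world/")` yields `/hello/world/`, `/helo/world/`, and the empty string.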
## Syntax
The route template syntax uses a clear distinction between static path components and dynamic segments. Dynamic segments are enclosed in angle brackets and follow this format:
```
<type[!][(arg)][:key][?[=default]]>
```
Where:
- `type` specifies the data type of the segment (**required**)
- `!` marks the segment as no-convert: the value is kept as a string instead of being converted to its native type, which is useful for large values (optional)
- `(arg)` defines optional constraints or arguments for the type (optional)
- `:key` names the parameter for accessing the captured value (optional)
- `?` marks the segment as optional (optional)
- `=default` specifies a default value for optional segments (optional)
For example:
```
/users/<int(1:100):user_id>/posts/<str:title?>
```
Note that dynamic segments can appear anywhere in the path:
```
/document-<int:version>.pdf // Within a segment
/<int:year>/<int(1:12):month>/<int(1:31):day> // Multiple segments
/prefix-<str:name>-suffix // Inside a path component
```
A range may also consist of a single value: `int(10)` is equivalent to `int(10:10)`.
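To make the grammar concrete, here is a small hand-written parser for a single dynamic segment. It is an illustrative Python sketch under stated assumptions, not Vessel's actual parser: the `Segment` class and `parse_segment` function are hypothetical names, the argument is kept as raw text, and a `)` inside an argument is not supported.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    type: str
    no_convert: bool = False
    arg: Optional[str] = None
    key: Optional[str] = None
    optional: bool = False
    default: Optional[str] = None


def parse_segment(text: str) -> Segment:
    """Parse `<type[!][(arg)][:key][?[=default]]>` into its parts."""
    if not (text.startswith("<") and text.endswith(">")):
        raise ValueError("segment must be enclosed in angle brackets")
    body = text[1:-1]

    i = 0
    while i < len(body) and body[i] not in "!(:?":
        i += 1
    seg = Segment(type=body[:i].lower())  # type names are case-insensitive
    if not seg.type:
        raise ValueError("type is required")

    if i < len(body) and body[i] == "!":
        seg.no_convert = True
        i += 1
    if i < len(body) and body[i] == "(":
        close = body.find(")", i)
        if close == -1:
            raise ValueError("unterminated argument")
        seg.arg = body[i + 1:close].strip()
        i = close + 1
    if i < len(body) and body[i] == ":":
        j = i + 1
        while j < len(body) and body[j] != "?":
            j += 1
        seg.key = body[i + 1:j].lower()  # keys are stored in lowercase
        if not seg.key:
            raise ValueError("empty key")
        i = j
    if i < len(body) and body[i] == "?":
        seg.optional = True
        i += 1
        if i < len(body) and body[i] == "=":
            seg.default = body[i + 1:]  # may be the empty string
            i = len(body)
    if i != len(body):
        raise ValueError("unexpected text: " + body[i:])
    if seg.default is not None and seg.key is None:
        raise ValueError("a default requires a key")
    return seg
```

For instance, `parse_segment("<int(1:100):user_id>")` produces a segment with type `int`, argument `1:100`, and key `user_id`.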
## Type System
The system supports the following fundamental data types. Each is parsed by an internal C function (_not regex, never regex_):
| Type | Description | Argument | Native type (Vessel's choice in **bold**) | Example match |
| -------- | -------------------------------------------------------- | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------ | ------------------------------------ |
| `int` | Integer values (from `-(10^255-1)` to `10^256-1`) | Optional `int` range. | **64-bit to 128-bit integer** or double | -10, 0, 10 |
| `str` | String values (no slashes) | Optional `int` range describing the string length. | **String** | hello, user-1 |
| `path` | Path segments (with slashes) | Optional `int` range describing the path length. | **String** | docs/intro/start |
| `float` | Floating-point values (from `-(10^254-1)` to `10^255-1`) | Optional `int` range. | **Double** | 3.14, -0.5, 0, 1 |
| `double` | `float` values, floating part required. | Optional `int` range. | **Double** | 3.14, -0.5 |
| `bool` | Boolean values (case-insensitive, human readable) | Optional `bool` list. | 8-bit integer or **boolean** | true, 1, YES, Up, false, 0, no, down |
| `date` | Date values | Optional date format, by default `%Y-%m-%d` is assumed. More about time below. | **Double (UNIX time-stamp)**, 64 to 128 bit integer (UNIX time-stamp), or date-time object | 2025-03-26 |
| `uuid` | UUID values | Optional UUID version. If `0` or unspecified it is version-agnostic. | **String** | 0fdc17bc-e190-4466-8ad1-ce2299193d29 |
| `hex` | Hexadecimal values | Optional `int` range describing the hex string length. | **String** | ca73422984b732c, 13e63d4bb0f658 |
| `nop` | No-operation. | - | - | - |
### Constraints
Types can accept arguments that constrain valid values:
```
<int(1:100):page> // Integer between 1 and 100
<str(3:20):username> // String between 3 and 20 characters
<str(255):lowercase> // String of exactly 255 characters
<date(%m-%Y-%d):date> // Date in specific format
<float(0:1):ratio> // Float between 0.0 and 1.0
```
#### `int` ranges
The syntax for `int` ranges is as follows:
```
a:b/step
```
In this syntax, `a`, `b`, and `step` are all integer values:
- `a` and `b` are signed integers.
- `step` is an unsigned integer.
The range is inclusive, meaning it includes both the start `a` and end `b` values.
Each component (`a`, `b`, and `step`) is optional and can be omitted. Here's what each variation means:
- `a:a` - only the integer `a`.
- `a` - same as `a:a`
- `:` - all integers from negative infinity to positive infinity.
- `:/step` - all integers from negative infinity to positive infinity, but only those divisible by `step`.
- `/step` - same as `:/step`.
- `a:` - all integers from `a` to positive infinity.
- `:b` - all integers from negative infinity to `b`.
- `a:b` - all integers from `a` to `b`.
- `a:/step` - all integers from `a` to positive infinity, with each integer being a multiple of `step`.
- `:b/step` - all integers from negative infinity to `b`, with each integer being a multiple of `step`.
- `a:b/step` - all integers from `a` to `b`, with each integer being a multiple of `step`.
Where:
- Start (`a`): The inclusive beginning of the range.
- End (`b`): The inclusive end of the range.
- Step (`step`): The interval at which integers are selected within the range. For example, a step of 2 would select every other integer.
Note that spaces before and after the range are ignored.
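The range variations listed above can be checked with a short parser. This is a Python sketch rather than Vessel's C code; `parse_int_range` and `in_range` are hypothetical names. It assumes, following the bullet list, that `step` selects integers divisible by `step` (not integers counted from `a`), and that an omitted step means every integer.

```python
from typing import Optional, Tuple


def parse_int_range(spec: str) -> Tuple[Optional[int], Optional[int], int]:
    """Parse `a:b/step` into (low, high, step); None means unbounded."""
    range_part, _, step_part = spec.strip().partition("/")
    step = int(step_part) if step_part.strip() else 1
    if step <= 0:
        raise ValueError("step must be a positive integer")

    range_part = range_part.strip()
    if ":" in range_part:
        low_text, _, high_text = range_part.partition(":")
        low = int(low_text) if low_text.strip() else None
        high = int(high_text) if high_text.strip() else None
    elif range_part:
        low = high = int(range_part)  # `a` is the same as `a:a`
    else:
        low = high = None             # `/step` is the same as `:/step`
    return low, high, step


def in_range(value: int, spec: str) -> bool:
    """Check a value against an inclusive range with an optional step."""
    low, high, step = parse_int_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return value % step == 0  # assumption: step means "divisible by step"
```

With this reading, `in_range(4, "0:10/2")` holds while `in_range(5, "0:10/2")` does not.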
#### `bool` list
A `bool` list consists of two parts separated by `/`: truthy values, then falsy values. Matching is case-insensitive:
```
true 1 yes up / false 0 no down
```
In this syntax:
- The first part lists values that are considered true (e.g., `true`, `1`, `yes`, `up`).
- The second part lists values that are considered false (e.g., `false`, `0`, `no`, `down`).
Only one of the two parts is required, so all of the following are valid:
- `/ falsy values`
- `truthy values /`
- `truthy values` (no slash)
- `truthy values / falsy values`
Spaces serve only to separate values and are otherwise ignored.
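The list forms above can be handled by splitting once on `/` and then on whitespace. The following Python sketch illustrates the idea; `parse_bool_list` and `match_bool` are hypothetical names and not part of Vessel.

```python
from typing import FrozenSet, Optional, Tuple


def parse_bool_list(spec: str) -> Tuple[FrozenSet[str], FrozenSet[str]]:
    """Split `truthy values / falsy values` into two lowercase sets."""
    truthy_text, _, falsy_text = spec.lower().partition("/")
    return frozenset(truthy_text.split()), frozenset(falsy_text.split())


def match_bool(text: str, spec: str) -> Optional[bool]:
    """Return True or False for a listed value, or None when unlisted."""
    truthy, falsy = parse_bool_list(spec)
    word = text.lower()  # matching is case-insensitive
    if word in truthy:
        return True
    if word in falsy:
        return False
    return None
```

A spec without a slash, such as `on`, yields an empty falsy set, which corresponds to the "truthy values (no slash)" form.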
#### `date` format
| Specifier | Description | Example |
| --------- | ---------------------------------------------------- | ------------ |
| `%Y` | Year in four digits | `2025` |
| `%y` | Year in two digits (00-99) | `25` |
| `%m` | Month as a two-digit number (01-12) | `03` (March) |
| `%b` | Abbreviated month name | `Mar` |
| `%B` | Full month name | `March` |
| `%d` | Day of the month as a two-digit number (01-31) | `26` |
| `%j` | Day of the year as a three-digit number (001-366) | `096` |
| `%H` | Hour in 24-hour format (00-23) | `21` (9 PM) |
| `%I` | Hour in 12-hour format (01-12) | `09` (9 AM) |
| `%M` | Minute as a two-digit number (00-59) | `44` |
| `%S` | Second as a two-digit number (00-59) | `00` |
| `%p` | AM/PM indicator | `PM` |
| `%A` | Full weekday name | `Sunday` |
| `%a` | Abbreviated weekday name | `Sun` |
| `%w` | Weekday as a decimal number (0-6), where Sunday is 0 | `0` (Sunday) |
| `%W` | Week of the year as a decimal number (00-53) | `13` |
| `%Z` | Time zone abbreviation | `EEST` |
| `%z` | Time zone offset from UTC in hours and minutes | `+0300` |
| `%%` | Literal `%` | `%` |
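The specifiers in this table have the same meaning in Python's `datetime.strptime`, so the conversion to a UNIX time-stamp can be sketched as follows. This is only an illustration: `parse_date` is a hypothetical name, treating dates without a zone as UTC is an assumption, and `strptime` is more lenient than the table (for example, it accepts a single-digit month).

```python
from datetime import datetime, timezone
from typing import Optional


def parse_date(text: str, fmt: str = "%Y-%m-%d") -> Optional[float]:
    """Parse a date and return a UNIX time-stamp as a double, or None."""
    try:
        parsed = datetime.strptime(text, fmt)
    except ValueError:
        return None  # the text does not match the format
    if parsed.tzinfo is None:
        # Assumption: values without a %z offset are interpreted as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()
```

Under these assumptions, `parse_date("03-2025-26", "%m-%Y-%d")` and `parse_date("2025-03-26")` return the same time-stamp.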
#### Supported `uuid` versions
| UUID Version | Description | Example ([v] - static version ID) | Key Characteristics |
| ------------ | ------------------------------------------------------------------------------------- | ---------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `v1` | Time-based UUID generated using the time-stamp and MAC address of the host. | `c9bab110-0757-[1]1f0-9e73-df019ce9bbd0` | Includes time-stamp and MAC address; not anonymous but ensures uniqueness. |
| `v2` | DCE Security UUID that includes time-stamp, MAC address, and local domain identifier. | `000001f5-5e9a-[2]1ea-9e00-0242ac130003` | Replaces parts of the time-stamp with a local identifier (UID/GID); limited to 64 different UUIDs per 7-minute period; reveals when, where, and by whom it was created. |
| `v3` | Namespace-based UUID using MD5 hashing. | `3d813cbb-47fb-[3]2ba-91df-831e1593ac29` | Deterministic; same input string and namespace produce the same UUID; based on MD5 hash of namespace and name. |
| `v4` | Randomly generated UUID with no inherent logic. | `0b2c3f13-4f0c-[4]83e-a1da-6a6ce1675fc5` | Fully random; highly unlikely to collide due to 2^122 possible combinations; most commonly used version. |
| `v5` | Namespace-based UUID using SHA1 hashing (successor to v3). | `0a959265-f1f5-[5]8c2-988c-71bbb7d6a8e0` | Deterministic; more secure than v3 due to SHA1 hashing; recommended over v3 by RFC 4122. |
| `v6` | Reordered time-based UUID (improvement over v1). | `1a47bc20-a6ce-[6]b7d-88c7-0a959265f1f5` | Same data as v1 but reordered to be sortable by time-stamp; provides time-ordered sequence. |
| `v7` | Time-ordered UUID with random data. | `017f22e2-79b0-[7]c9e-9ab2-cfe0d5a716fa` | Combines time-stamp with random data; sortable by creation time while maintaining privacy. |
| `v8` | Custom UUID format with user-defined data. | `b4a2f5d1-ec8d-[8]7a3-96e5-2bc41f0d7e3a` | Allows custom implementation with only version and variant bits required; completely customizable. |
The `v` prefix is optional and is ignored if present, so `uuid(4)` and `uuid(v4)` are equivalent. Spaces before and after the version are ignored.
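As the bracketed digit in the example column shows, the version is the first hex digit of the third group. A regex-free check might look like the Python sketch below; `parse_uuid_version` and `match_uuid` are hypothetical names, and the variant bits are deliberately not inspected.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def parse_uuid_version(arg: str) -> int:
    """Parse the version argument; 0 means version-agnostic."""
    text = arg.strip().lower()
    if text.startswith("v"):
        text = text[1:]  # the `v` prefix is optional and ignored
    return int(text) if text else 0


def match_uuid(text: str, version_arg: str = "") -> bool:
    """Check the 8-4-4-4-12 hex layout and, if requested, the version."""
    groups = text.split("-")
    if [len(group) for group in groups] != [8, 4, 4, 4, 12]:
        return False
    if not all(ch in HEX_DIGITS for group in groups for ch in group):
        return False
    wanted = parse_uuid_version(version_arg)
    return wanted == 0 or int(groups[2][0], 16) == wanted
```

For the example UUID from the type table, `match_uuid("0fdc17bc-e190-4466-8ad1-ce2299193d29", "v4")` is true because the third group starts with `4`.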
### Custom Types
Custom data types are referenced using a dollar sign prefix:
```
<$email:contact> // Custom email type
<$hex_clr:background> // Custom hex color type
```
Each custom type must be registered with a C function that validates and parses the input.
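The registration API itself is a C interface that this document does not describe. Purely to illustrate the idea of one validating and parsing function per custom type, here is a Python model; `register_type`, `parse_custom`, and `parse_hex_color` are hypothetical names, and the accepted color format (3 or 6 hex digits, no `#`) is an assumption.

```python
from typing import Callable, Dict, Optional

BUILTIN_TYPES = {"int", "str", "path", "float", "double",
                 "bool", "date", "uuid", "hex", "nop"}

Parser = Callable[[str], Optional[object]]
_custom_types: Dict[str, Parser] = {}


def register_type(name: str, parser: Parser) -> None:
    """Register a parser that returns a value, or None on mismatch."""
    name = name.lower()
    if name in BUILTIN_TYPES:
        raise ValueError(f"{name!r} is reserved for a built-in type")
    _custom_types[name] = parser


def parse_hex_color(text: str) -> Optional[int]:
    """Accept 3 or 6 hex digits and return the color as an integer."""
    digits = text.lower()
    if len(digits) not in (3, 6):
        return None
    if any(ch not in "0123456789abcdef" for ch in digits):
        return None
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)  # f80 -> ff8800
    return int(digits, 16)


def parse_custom(type_name: str, text: str) -> Optional[object]:
    """Look up `$name` in the registry and run its parser."""
    name = type_name[1:] if type_name.startswith("$") else type_name
    return _custom_types[name.lower()](text)


register_type("hex_clr", parse_hex_color)
```

With this model, a segment such as `<$hex_clr:background>` would hand the captured text to `parse_hex_color`.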
### Wildcard and Path Matching
The `path` type acts as a wildcard that can match multiple segments:
```
/docs/<path:article_path> // Matches /docs/intro, /docs/advanced/routing, etc.
```
The `path` type captures all remaining segments, including slashes, making it ideal for flexible route patterns. Nothing that follows a `path` segment is validated.
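The difference between `str` and `path` capture can be shown with a toy matcher for the `/docs/` route above. This Python sketch uses hypothetical names (`capture_str`, `capture_path`, `match_docs`) and omits the optional length range.

```python
from typing import Callable, Optional


def capture_str(rest: str) -> Optional[str]:
    """`str` semantics: a single non-empty component without slashes."""
    return rest if rest and "/" not in rest else None


def capture_path(rest: str) -> Optional[str]:
    """`path` semantics: everything that remains, slashes included."""
    return rest or None


def match_docs(url: str,
               capture: Callable[[str], Optional[str]] = capture_path
               ) -> Optional[str]:
    """Match `/docs/<...>` and return the captured value, or None."""
    prefix = "/docs/"
    if not url.startswith(prefix):
        return None
    return capture(url[len(prefix):])
```

Here `match_docs("/docs/advanced/routing")` captures `advanced/routing`, whereas the same URL fails when `capture_str` is used instead.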
## Parameter Capture and Storage
Dynamic segment values are captured and stored under the specified key name:
```
<int:user_id> // Captures an integer and stores it as "user_id"
```
Segments can be marked as optional with a question mark:
```
<str:category?> // Optional category parameter
```
Default values can be provided for optional segments:
```
<int:page?=1> // Default page is 1 if not specified
<str:sort?=name> // Default sort is "name" if not specified
```
### Behaviour
- If an optional segment is present in the URL, its value is used.
- If absent with a default, the default value is used.
- If absent without a default, the key will not be present in the parameters.
### Validation mode
A segment can validate URL format without capturing the value by omitting the key:
```
<int(1:100)> // Validates as integer in range but doesn't store
```
In validation mode:
- No value is stored for the segment.
- Default values are not allowed; specifying one is invalid syntax.
- If the segment doesn't match, the route doesn't match (as with any segment).
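The storage rules for optional segments and validation mode can be summarized in a few lines. This is a Python sketch of the behaviour described above, not Vessel's implementation; `store_capture` is a hypothetical name, and values are kept as strings because type conversion is out of scope here.

```python
from typing import Dict, Optional


def store_capture(params: Dict[str, str],
                  key: Optional[str],
                  captured: Optional[str],
                  default: Optional[str] = None) -> None:
    """Apply the capture rules for one segment.

    `captured` is None when an optional segment is absent from the URL.
    """
    if key is None:
        return                           # validation mode: nothing is stored
    if captured is not None:
        params[key.lower()] = captured   # present: the URL value is used
    elif default is not None:
        params[key.lower()] = default    # absent with default: default is used
    # absent without default: the key stays out of the parameters
```

After processing an absent `<int:page?=1>` and an absent `<str:category?>`, the parameters contain `page` but not `category`.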
## Case Sensitivity
Case-insensitive elements:
- Static path segments
- Type names (`int`, `INT`, `Int` are equivalent)
- Key names (always stored in lowercase)
Case-sensitive elements:
- Actual path values in the URL (unless specified otherwise)
- Arguments and constraints (unless specified otherwise)
- Default values (unless specified otherwise)
## Examples
### Basic Routes
```
/about // Static route
/users/<int:user_id> // Simple dynamic route
/articles/<int:year>/<str:title> // Multiple segments
```
### Optional Segments
```
/products/<int:page?=1> // Optional with default
/files/<path:filepath?> // Optional path segment
/search/<str:query?=> // Optional with empty default
```
### Constrained Values
```
/pages/<int(1:100):page> // Integer range
/register/<str(5:20):username> // String length
```
### Combined Examples
```
/api/v<int(1:3):version>/users/<uuid:user_id>/posts/<int:post_id?>
```
This matches paths like:
- `/api/v1/users/0fdc17bc-e190-4466-8ad1-ce2299193d29/posts/42`
- `/api/v2/users/0fdc17bc-e190-4466-8ad1-ce2299193d29/posts`
```
/archive/<int(1900:2100):year>/<int(1:12):month?>/<int(1:31):day?>
```
This matches paths like:
- `/archive/2025`
- `/archive/2025/3`
- `/archive/2025/3/26`
```
/shop/<str:category>/<str:subcategory?>/<str:product_slug>-<int:product_id>
```
This matches paths like:
- `/shop/electronics/smartphones/hello-world-12345`
- `/shop/electronics/hello-world-pro-12345`
## Query Parameters
Query parameters are handled automatically and are not part of the route template. Some routes may disable query parameter handling via an internal flag to avoid memory and processing overhead.
## Edge Cases and Special Considerations
1. Empty segments should be handled explicitly:
```
/tags/<str:tag?=> // Empty string is allowed as a default
```
2. Use of adjacent dynamic segments:
```
/<int:id><str:suffix> // Requires precise boundary detection
```
Although this is technically feasible, it is strongly advised against. This approach is considered poor practice and can result in significant routing problems.
3. To use literal angle brackets (or any character) in static parts:
```
/literal\<not-a-dynamic-segment\>
```
Note that _any_ character after a backslash is treated as a literal, not just angle brackets.
4. Duplicate key handling:
```
/users/<int:id>/posts/<int:id> // Second `id` overwrites first. First validates as int but isn't stored.
```
The first occurrence takes precedence, proceeding segments with the same key become validation-only.
5. The `path` type must always be the last segment in a route:
Valid:
```
/files/<path:filepath>
```
Invalid:
```
/files/<path:filepath>/<int:version>
```
6. Optional segments followed by required segments are invalid:
Invalid:
```
/users/<int:id?>/<str:name>
```
7. Default value validation:
Valid:
```
<int(1:10):page?=5> // Default is within range
```
Invalid:
```
<int(1:10):page?=15> // Default violates range
```
8. Case sensitivity conflicts:
```
/Hello/<str:Name> // Static "Hello" is case-insensitive, `Name` key is stored as lowercase "name"
```
9. Custom types cannot override built-in types:
Invalid:
```
<$int:custom_int> // "int" is reserved
```
10. Ambiguous static/dynamic boundaries:
```
/abc<int:x>def // Matches "/abc123def" (x=123) but not "/abc123/def"
```
Avoid ambiguity wherever possible.
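Several of these rules can be checked mechanically when a template is read. The Python sketch below tokenizes a template while honoring backslash escapes (item 3) and then enforces the ordering rules from items 5 and 6. It is an illustration under stated assumptions, not Vessel's compiler: `tokenize` and `check_order` are hypothetical names, `?` markers in static text are left as plain characters, and only dynamic segments after a `path` segment are rejected.

```python
from typing import List, Tuple


def tokenize(template: str) -> List[Tuple[str, str]]:
    """Split a template into ("static", text) and ("dynamic", body) tokens."""
    tokens: List[Tuple[str, str]] = []
    static: List[str] = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "\\":
            if i + 1 >= len(template):
                raise ValueError("dangling backslash")
            static.append(template[i + 1])  # any escaped character is literal
            i += 2
        elif ch == "<":
            end = template.find(">", i)
            if end == -1:
                raise ValueError("unterminated dynamic segment")
            if static:
                tokens.append(("static", "".join(static)))
                static = []
            tokens.append(("dynamic", template[i + 1:end]))
            i = end + 1
        else:
            static.append(ch)
            i += 1
    if static:
        tokens.append(("static", "".join(static)))
    return tokens


def check_order(template: str) -> None:
    """Raise ValueError if the `path`-last or optional-order rule is broken."""
    seen_path = seen_optional = False
    for kind, body in tokenize(template):
        if kind != "dynamic":
            continue
        if seen_path:
            raise ValueError("`path` must be the last segment")
        end = next((i for i, c in enumerate(body) if c in "!(:?"), len(body))
        rest = body[end:]
        if rest.startswith("!"):
            rest = rest[1:]
        if rest.startswith("("):
            rest = rest[rest.find(")") + 1:]  # skip the argument
        optional = "?" in rest
        if seen_optional and not optional:
            raise ValueError("required segment after an optional segment")
        seen_optional = seen_optional or optional
        seen_path = body[:end].lower() == "path"
```

For example, `tokenize` turns the escaped template from item 3 into a single static token, and `check_order` rejects the invalid templates shown in items 5 and 6.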
## Compilation and Interpretation
Before a dynamic path template is used, it is compiled into an internal format. Once compiled, the template is read-only and cannot be modified. Compilation both optimizes the template and validates it, so that errors are detected before the route is used.