Commit graph

16 commits

Author SHA1 Message Date
Alexandre Gomes Gaigalas
d4d04b87cd Fix PublicSuffix validator and UpdateDomainSuffixesCommand
- Parce PSL ICANN section into structured sections (rules,
   wildcards, exceptions) according to the format.
 - Updates PublicSuffix semantics for complete application of
   the rules.
 - Includes private domain suffixes now.
 - Refreshes the existing data.
 - Fixes the update-regionals.yml workflow, set it to run
   twice a week.

References: https://github.com/publicsuffix/list/wiki/Format#format
2026-03-04 16:23:25 +00:00
copilot-swe-agent[bot]
eedce8fb32 Use Punycode filenames for non-ASCII TLD suffix data files
Some systems and tools (e.g., certain archive extractors, Windows
environments, or CI pipelines) do not properly handle non-ASCII
characters in file paths. The public suffix data files for
internationalized TLDs (such as ישראל, СРБ, 香港, and ไทย) were stored
using their native Unicode names, which caused installation failures
on those systems.

This commit converts those filenames to their Punycode equivalents
(e.g., XN--4DBRK0CE.php instead of ישראל.php) using `idn_to_ascii()`.
Both the data generation command (`UpdateDomainSuffixesCommand`) and the
runtime validator (`PublicDomainSuffix`) are updated to use the same
Punycode-based file lookup, ensuring consistency. A polyfill dependency
(`symfony/polyfill-intl-idn`) is added so that `idn_to_ascii()` is
available even when the `intl` PHP extension is not installed.

Assisted-by: Claude Code (Claude Opus 4.6)
Co-authored-by: Henrique Moody <henriquemoody@gmail.com>
2026-02-09 17:34:56 +01:00
Henrique Moody
dc0c0345c9
Create "Formatted" validator
The Formatted validator decorates another validator to transform how
input values appear in error messages, while still validating the
original unmodified input.

This is useful for improving the readability of error messages by
displaying values in a user-friendly formatd.

The validator accepts any Respect\StringFormatter\Formatter implementation,
allowing direct use of StringFormatter's fluent builder. As StringFormatter
expands with more formatters in future releases, users will automatically
benefit from the full range of formatting options.

Assisted-by: Claude Code (Opus 4.5)
2026-02-06 20:44:26 +01:00
Henrique Moody
7c681fec66
Fix SPDX headers in all files
I ran the `bin/console spdx --fix` with different strategies for
different files. For most of the core classes, since they've been
drastically rebuilt, I've run it with the `git-blame` strategy, for for
the `src/Validators`, in which the API changed completely but the logic
remains the same, I use the `git-log` strategy.
2026-02-03 15:23:23 +01:00
Henrique Moody
7db3bea8a6
Enhance LintSpdxCommand with contributor tracking and header normalization
Improves SPDX header linting to ensure consistent license metadata across
the codebase.

Key changes:

- Enforce deterministic tag ordering (License-Identifier, FileCopyrightText,
  FileContributor) to ensure consistency, prevent merge conflicts, and
  simplify code reviews

- Add contributor alias mapping to consolidate contributors with multiple
  emails or name variations (e.g., "nickl-" → "Nick Lombard")

- Add --contributions-strategy option with "blame" (current code authors)
  and "log" (all historical contributors) to support different attribution
  philosophies

- Add optional path argument to lint specific files or directories

- Add --fix option to automatically correct header issues

Assisted-by: Claude Code (claude-opus-4-5-20251101)
2026-02-03 15:23:20 +01:00
Henrique Moody
63f7753e43
Rename "Call" validator to "After"
The previous name was confusing as it focused on the implementation
detail (calling a callable) rather than what the validator actually
does. The new name "After" better conveys the validator's purpose:
validating the input after applying a transformation.

Assisted-by: Claude Code (Opus 4.5)
2026-02-02 01:46:12 +01:00
Henrique Moody
4390e4feb6
Simplify how we load and save files in data/
We had different ways of saving and loading files from `data/`, so I decided to
unify them to simplify things. I repurposed the `DomainInfo` class and named it
`DataLoader`, so we can use the same class to load anything from the `data/`
directory.
2026-01-26 20:28:29 +01:00
Henrique Moody
819d734a00
Check for mismatches in the mixin classes
When we change the contract of a validator, or create a new one, we need to
ensure that the mixin for the validator is present and matches the validator's
constructor.

This commit changes the current class that generates those mixin classes,
converting it into a linter so we can run it in the GitHub workflow to check for
missing changes.
2026-01-26 20:14:09 +01:00
Henrique Moody
0190f3e109
Group lint-related commands together
Since we have so many lint-related commands now, it makes sense to group
them together to it's easier to spot them.
2026-01-26 20:14:09 +01:00
Alexandre Gomes Gaigalas
a91517108e Lint SPDX conventions
The `reuse lint` command only checks for REUSE compliance, which
will accept all sorts of SPDX headers.

In this project, however, we have also other conventions. For
example, we require all PHP and docs files from the project
to have a specific license (not just any license) and also a
specific File Copyright Text (not just any copyright).

This commit introduces a command to solve this problem, validating
the headers more thoroughly.

The introduced command also does some dogfooding, using validators
from the library itself to perform some of its tasks, namely: Call,
Each, Contains, Templated and Named, showcasing potential different
use cases for the project.
2026-01-26 16:07:09 +00:00
Henrique Moody
140bd36aa3
Rename library/ to src/
We've always considered renaming this directory, as it's not a common
standard to name `library` the directory where the source code of a
library it. Having it as `src/` is a common pattern we find in several
PHP libraries these days.

Acked-by: Alexandre Gomes Gaigalas <alganet@gmail.com>
2026-01-22 13:13:15 +01:00
Alexandre Gomes Gaigalas
d9cdc118b2 Introduce REUSE compliance
This commit introduces REUSE compliance by annotating all files
with SPDX information and placing the reused licences in the
LICENSES folder.

We additionally removed the docheader tool which is made obsolete
by this change.

The main LICENSE and copyright text of the project is now not under
my personal name anymore, and it belongs to "The Respect Project
Contributors" instead.

This change restores author names to several files, giving the
appropriate attribution for contributions.
2026-01-21 06:28:11 +00:00
Alexandre Gomes Gaigalas
3270c1f72c Make all remaining validators serializable
This commit concludes the effort to make all current validators
serializable by fixing the remaining ones.

The ability to use `finfo` instances on some filesystem validators
was removed. `Image` was refactored to be a readonly class.

A command was added to the main `composer qa` flow that checks
if all validators are covered by smoke tests (which are currently
used by benchmarks and serialization tests). Therefore, this commit
also ensures that every validator has a benchmark.
2026-01-19 11:04:35 +00:00
Henrique Moody
098c973a2a
Add GitHub action to lint documentation files
When we make changes to the category of a validator, it's easy to forget
to update overall list of validators. This commit a GitHub actions that
will run a console command to check if the documentation it up-to-date.

The job will fail when we need to change the document, but the console
command will fix the issues, so there isn't a lot of friction there.
2026-01-13 23:37:05 -07:00
Henrique Moody
35ea95c6f0
Remove number prefixes from Markdown files
We used to have those to preserve the order of the pages when generating
the documentation with MkDocs. This commit introduces the
`mkdocs-nav-weight`, that allows us to make that order without having
those prefixes.
2026-01-07 14:46:06 +01:00
Henrique Moody
7892a7c902
Port Bash scripts to PHP
It makes more sense to use PHP to generate PHP code than to use Bash. I
love writing Bash scripts, but I know it's not for everyone, and they
can become quite complex. Porting them to PHP code also lowers the
barrier for people to change them.

While I was making those changes, I also noticed a problem with how we
save the domain suffixes. We're converting all of them to ASCII, so we
are not preserving languages such as Chinese, Thai, and Hebrew, which
use non-ASCII characters.
2026-01-06 10:06:22 +01:00