Commit graph

5 commits

Author SHA1 Message Date
copilot-swe-agent[bot]
eedce8fb32 Use Punycode filenames for non-ASCII TLD suffix data files
Some systems and tools (e.g., certain archive extractors, Windows
environments, or CI pipelines) do not properly handle non-ASCII
characters in file paths. The public suffix data files for
internationalized TLDs (such as ישראל, СРБ, 香港, and ไทย) were stored
using their native Unicode names, which caused installation failures
on those systems.

This commit converts those filenames to their Punycode equivalents
(e.g., XN--4DBRK0CE.php instead of ישראל.php) using `idn_to_ascii()`.
Both the data generation command (`UpdateDomainSuffixesCommand`) and the
runtime validator (`PublicDomainSuffix`) are updated to use the same
Punycode-based file lookup, ensuring consistency. A polyfill dependency
(`symfony/polyfill-intl-idn`) is added so that `idn_to_ascii()` is
available even when the `intl` PHP extension is not installed.

Assisted-by: Claude Code (Claude Opus 4.6)
Co-authored-by: Henrique Moody <henriquemoody@gmail.com>
2026-02-09 17:34:56 +01:00
Henrique Moody
7c681fec66
Fix SPDX headers in all files
I ran the `bin/console spdx --fix` with different strategies for
different files. For most of the core classes, since they've been
drastically rebuilt, I've run it with the `git-blame` strategy, for for
the `src/Validators`, in which the API changed completely but the logic
remains the same, I use the `git-log` strategy.
2026-02-03 15:23:23 +01:00
Henrique Moody
4390e4feb6
Simplify how we load and save files in data/
We had different ways of saving and loading files from `data/`, so I decided to
unify them to simplify things. I repurposed the `DomainInfo` class and named it
`DataLoader`, so we can use the same class to load anything from the `data/`
directory.
2026-01-26 20:28:29 +01:00
Alexandre Gomes Gaigalas
d9cdc118b2 Introduce REUSE compliance
This commit introduces REUSE compliance by annotating all files
with SPDX information and placing the reused licences in the
LICENSES folder.

We additionally removed the docheader tool which is made obsolete
by this change.

The main LICENSE and copyright text of the project is now not under
my personal name anymore, and it belongs to "The Respect Project
Contributors" instead.

This change restores author names to several files, giving the
appropriate attribution for contributions.
2026-01-21 06:28:11 +00:00
Henrique Moody
7892a7c902
Port Bash scripts to PHP
It makes more sense to use PHP to generate PHP code than to use Bash. I
love writing Bash scripts, but I know it's not for everyone, and they
can become quite complex. Porting them to PHP code also lowers the
barrier for people to change them.

While I was making those changes, I also noticed a problem with how we
save the domain suffixes. We're converting all of them to ASCII, so we
are not preserving languages such as Chinese, Thai, and Hebrew, which
use non-ASCII characters.
2026-01-06 10:06:22 +01:00