- Parses the PSL ICANN section into structured sections (rules,
wildcards, exceptions) according to the PSL format.
- Updates `PublicSuffix` semantics so that the rules are applied in
full.
- Now includes private domain suffixes.
- Refreshes the existing data.
- Fixes the update-regionals.yml workflow and sets it to run
twice a week.
References: https://github.com/publicsuffix/list/wiki/Format#format
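As an illustration of the split described above, here is a minimal
sketch that separates the ICANN section of a list into rules,
wildcards, and exceptions. The function name `parseIcannSection` and
the array keys are hypothetical and are not taken from this commit.
It assumes PHP 8.0+ (for `str_starts_with`) and relies only on the
markers the format defines: `//` comments, the
`===BEGIN/END ICANN DOMAINS===` delimiters, the `*.` wildcard prefix,
and the `!` exception prefix.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not the project's actual code: splits the ICANN
 * section of a Public Suffix List into rules, wildcards, and exceptions.
 *
 * @return array{rules: list<string>, wildcards: list<string>, exceptions: list<string>}
 */
function parseIcannSection(string $list): array
{
    $sections = ['rules' => [], 'wildcards' => [], 'exceptions' => []];
    $insideIcann = false;

    foreach (preg_split('/\R/', $list) as $line) {
        $line = trim($line);

        if ($line === '// ===BEGIN ICANN DOMAINS===') {
            $insideIcann = true;
            continue;
        }

        if ($line === '// ===END ICANN DOMAINS===') {
            break;
        }

        // Skip everything outside the ICANN section, blank lines, and comments.
        if (!$insideIcann || $line === '' || str_starts_with($line, '//')) {
            continue;
        }

        // Only the text up to the first whitespace belongs to the rule.
        $rule = preg_split('/\s+/', $line)[0];

        if (str_starts_with($rule, '!')) {
            $sections['exceptions'][] = substr($rule, 1);
        } elseif (str_starts_with($rule, '*.')) {
            $sections['wildcards'][] = substr($rule, 2);
        } else {
            $sections['rules'][] = $rule;
        }
    }

    return $sections;
}
```

The sketch stops at the end of the ICANN section, so handling the
private section would need a second pass or an extra parameter.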
Some systems and tools (e.g., certain archive extractors, Windows
environments, or CI pipelines) do not properly handle non-ASCII
characters in file paths. The public suffix data files for
internationalized TLDs (such as ישראל, СРБ, 香港, and ไทย) were stored
using their native Unicode names, which caused installation failures
on those systems.
This commit converts those filenames to their Punycode equivalents
(e.g., XN--4DBRK0CE.php instead of ישראל.php) using `idn_to_ascii()`.
Both the data generation command (`UpdateDomainSuffixesCommand`) and the
runtime validator (`PublicDomainSuffix`) are updated to use the same
Punycode-based file lookup, ensuring consistency. A polyfill dependency
(`symfony/polyfill-intl-idn`) is added so that `idn_to_ascii()` is
available even when the `intl` PHP extension is not installed.
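A minimal sketch of the filename mapping, assuming a hypothetical
helper named `suffixFilename` (the actual code in
`UpdateDomainSuffixesCommand` and `PublicDomainSuffix` may differ).
It needs either the `intl` extension or the polyfill for
`idn_to_ascii()` to be available.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: maps a TLD to an ASCII-only data filename.
 */
function suffixFilename(string $tld): string
{
    // idn_to_ascii() returns false when the input cannot be converted.
    $ascii = idn_to_ascii($tld, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    if ($ascii === false) {
        throw new InvalidArgumentException(
            sprintf('Cannot convert "%s" to Punycode', $tld)
        );
    }

    // Uppercase to match the naming shown above (XN--4DBRK0CE.php).
    return strtoupper($ascii) . '.php';
}
```

Because the generator and the validator would both go through one
function like this, the name written to disk and the name looked up
at runtime cannot drift apart.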
Assisted-by: Claude Code (Claude Opus 4.6)
Co-authored-by: Henrique Moody <henriquemoody@gmail.com>
This commit introduces REUSE compliance by annotating all files
with SPDX information and placing the reused licences in the
LICENSES folder.
We also removed the docheader tool, which this change makes
obsolete.
The main LICENSE and copyright text of the project is no longer under
my personal name; it now belongs to "The Respect Project
Contributors" instead.
This change restores author names to several files, giving the
appropriate attribution for contributions.
With this change, we won't need to change the `PostalCode` validator
every time the data is updated. While I was moving the data, I noticed
some inefficiencies in the regular expressions, so I fixed those as
well.
Keeping the list of ISO 3166-2 codes up to date requires some
maintenance. At the same time, PHP ISO Codes maintains an up-to-date
database with even more ISO codes we could use in this library.
This change doesn't yet use everything PHP ISO Codes offers, but it
already removes some unnecessary code from the repository.
We used this library before, but it was heavy because it bundled all
the localizations. Now, the package is much smaller (5.0M).
Signed-off-by: Henrique Moody <henriquemoody@gmail.com>
Previously, we were loading country info from a JSON file. This
changes it to use PHP files instead. It also caches these resources
across calls, so each file is loaded at most once per process.
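The caching described above can be sketched as follows. The class
name `DataLoader` is hypothetical, and the sketch assumes each data
file is a PHP file that returns an array; the actual implementation
may be organised differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical loader: returns the array from a PHP data file and reads
 * each file at most once per process.
 */
final class DataLoader
{
    /** @var array<string, array<mixed>> */
    private static array $cache = [];

    /** @return array<mixed> */
    public static function load(string $path): array
    {
        // `require` only runs when the path is not in the cache yet.
        return self::$cache[$path] ??= require $path;
    }
}
```

A static property keeps the data for the lifetime of the process, so
repeated validations against the same country do not touch the
filesystem again.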
Inside the "data/" directory, we have files with lists of subdivisions
that need to be updated. We either have to update them manually or
automate that task with a script and GitHub Actions.
Both options are very time-consuming and not ideal. We don't want to
deal with that problem and, considering that users of this library may
want to display the data that we validate, we would need to create a
whole library to make that data more usable.
The "sokil/php-isocodes" is a simple library that even supports
translations. It's frequently updated and has gone through major
performance improvements.
I am not fond of the idea of requiring an external library to install
Validation, as I have seen that go wrong before [1]. Ideally, that
would be an optional dependency for people who would like to use those
rules, but to make that happen, we need to release a MAJOR version.
[1]: d072b4de6a
Signed-off-by: Henrique Moody <henriquemoody@gmail.com>
There is not much benefit in having individual rules for each country's
subdivisions; quite the opposite. It increases the amount of code and
makes it hard to change the implementation of these rules. Right now,
the only sane way to change those rules is with a customized script.
This commit removes the per-country Subdivision Code rules and puts
that information into JSON files instead.
Neither of us wants to keep this data in this library, and we are
considering having another library deal with it [1]. Since that may
take some time, it seems better to do it here temporarily.
[1]: https://github.com/sokil/php-isocodes/issues/12
Co-authored-by: Mazen Touati <mazen_touati@hotmail.com>
Signed-off-by: Henrique Moody <henriquemoody@gmail.com>