Commit graph

20 commits

Author SHA1 Message Date
The Respect Panda
c0c506692b
Update Regional Information 2026-03-02 10:46:21 -03:00
The Respect Panda
e924e39940 Update Regional Information 2026-02-26 00:24:14 +00:00
Alexandre Gomes Gaigalas
7c8ecfa317 Fix PublicSuffix validator and UpdateDomainSuffixesCommand
- Parses the PSL ICANN section into structured sections (rules,
  wildcards, exceptions) according to the format.
- Updates PublicSuffix semantics for complete application of
  the rules.
- Includes private domain suffixes now.
- Refreshes the existing data.
- Fixes the update-regionals.yml workflow and sets it to run
  twice a week.

References: https://github.com/publicsuffix/list/wiki/Format#format
2026-02-23 12:18:57 +00:00
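
As an illustration of the split into rules, wildcards, and exceptions described in the commit above, here is a minimal Python sketch. It is not the project's PHP implementation: the function name `parse_psl_section` and the returned dictionary layout are hypothetical, and the section markers are assumed to follow the published Public Suffix List format.

```python
def parse_psl_section(text,
                      begin="// ===BEGIN ICANN DOMAINS===",
                      end="// ===END ICANN DOMAINS==="):
    """Split one section of a Public Suffix List file into three sets.

    Hypothetical helper for illustration only; the begin/end markers are
    assumed to match the published list format.
    """
    rules, wildcards, exceptions = set(), set(), set()
    inside = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == begin:
            inside = True
            continue
        if line == end:
            break
        # Skip everything outside the section, blank lines, and comments.
        if not inside or not line or line.startswith("//"):
            continue
        # Only the text up to the first whitespace is part of the rule.
        entry = line.split()[0].lower()
        if entry.startswith("!"):
            exceptions.add(entry[1:])   # "!www.ck" -> exception "www.ck"
        elif entry.startswith("*."):
            wildcards.add(entry[2:])    # "*.ck" -> wildcard under "ck"
        else:
            rules.add(entry)            # plain suffix rule
    return {"rules": rules, "wildcards": wildcards, "exceptions": exceptions}


SAMPLE = """\
// ===BEGIN ICANN DOMAINS===
// a comment line
com
*.ck
!www.ck
// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
github.io
// ===END PRIVATE DOMAINS===
"""

parsed = parse_psl_section(SAMPLE)
```

Passing the private-section markers as `begin` and `end` would extract the private suffixes in the same way.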
copilot-swe-agent[bot]
eedce8fb32 Use Punycode filenames for non-ASCII TLD suffix data files
Some systems and tools (e.g., certain archive extractors, Windows
environments, or CI pipelines) do not properly handle non-ASCII
characters in file paths. The public suffix data files for
internationalized TLDs (such as ישראל, СРБ, 香港, and ไทย) were stored
using their native Unicode names, which caused installation failures
on those systems.

This commit converts those filenames to their Punycode equivalents
(e.g., XN--4DBRK0CE.php instead of ישראל.php) using `idn_to_ascii()`.
Both the data generation command (`UpdateDomainSuffixesCommand`) and the
runtime validator (`PublicDomainSuffix`) are updated to use the same
Punycode-based file lookup, ensuring consistency. A polyfill dependency
(`symfony/polyfill-intl-idn`) is added so that `idn_to_ascii()` is
available even when the `intl` PHP extension is not installed.

Assisted-by: Claude Code (Claude Opus 4.6)
Co-authored-by: Henrique Moody <henriquemoody@gmail.com>
2026-02-09 17:34:56 +01:00
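
The filename conversion described in the commit above can be sketched in Python as follows. This is only an illustration: the project uses PHP's `idn_to_ascii()`, whereas this sketch uses Python's built-in `idna` codec (IDNA 2003), and the helper name `suffix_data_filename` is hypothetical.

```python
def suffix_data_filename(tld: str) -> str:
    """Return an ASCII-only data filename for a TLD.

    Hypothetical helper; Python's built-in "idna" codec stands in for
    PHP's idn_to_ascii() here.
    """
    # Encode the label to its ASCII-compatible (Punycode) form.
    ascii_label = tld.encode("idna").decode("ascii")
    # Uppercase the label, mirroring the example filename in the commit.
    return ascii_label.upper() + ".php"


ascii_name = suffix_data_filename("com")
idn_name = suffix_data_filename("ישראל")
```

Because the generator and the validator must agree on the filename, both would call the same helper, so a lookup never depends on how the filesystem handles non-ASCII paths.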
The Respect Panda
74299fded1
Update Regional Information 2026-02-09 13:10:54 +01:00
Alexandre Gomes Gaigalas
d9cdc118b2 Introduce REUSE compliance
This commit introduces REUSE compliance by annotating all files
with SPDX information and placing the reused licences in the
LICENSES folder.

We additionally removed the docheader tool which is made obsolete
by this change.

The main LICENSE and copyright text of the project is no longer under
my personal name; it now belongs to "The Respect Project
Contributors" instead.

This change restores author names to several files, giving the
appropriate attribution for contributions.
2026-01-21 06:28:11 +00:00
Henrique Moody
f635cc748f
Update regional information
I manually ran the commands that update the data we use in `Tld`,
`PostalCode`, and `PublicDomainSuffix`.
2026-01-07 09:48:50 +01:00
Henrique Moody
6dfad94985
Move postal code regexes to the data/ directory
With this change, we won't need to change the `PostalCode` validator
every time there's an update. While I was moving the data, I noticed
some inefficiencies with the regular expressions, so I made some
changes.
2026-01-06 10:06:22 +01:00
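
To illustrate why moving the patterns into data means the validator no longer changes on every update, here is a minimal Python sketch. The `POSTAL_CODE_PATTERNS` table stands in for files under `data/`, and both its contents and the function name `is_valid_postal_code` are hypothetical; the patterns are simplified examples, not the project's data.

```python
import re

# Stand-in for patterns loaded from a data directory (illustrative values
# only, not the project's actual regular expressions).
POSTAL_CODE_PATTERNS = {
    "US": r"\d{5}(-\d{4})?",
    "BR": r"\d{5}-?\d{3}",
}


def is_valid_postal_code(country_code: str, value: str) -> bool:
    """Validate a postal code using only the data table above."""
    pattern = POSTAL_CODE_PATTERNS.get(country_code.upper())
    if pattern is None:
        # Unknown country: this sketch rejects the value.
        return False
    # fullmatch anchors the pattern at both ends of the input.
    return re.fullmatch(pattern, value) is not None


us_ok = is_valid_postal_code("US", "90210")
br_ok = is_valid_postal_code("br", "01310-100")
us_bad = is_valid_postal_code("US", "9021")
```

Adding or correcting a country then only touches the data table, never the validation logic.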
Henrique Moody
f171c4725a
Move list of TLD to the data/ directory
With that change, we won't need to change the Tld validator every time
there's an update.
2026-01-06 09:42:29 +01:00
The Respect Panda
ae7a20f6d3
Update Regional Information 2024-03-25 18:43:50 +01:00
Henrique Moody
04b2722d02
Remove ISO 3166-2 data in favor of PHP ISO codes
Keeping the list of ISO 3166-2 up-to-date requires some maintenance. At
the same time, PHP ISO Codes maintains an up-to-date database with even
more ISO codes we could use in this library.

This change doesn't fully use all resources of PHP ISO Codes, but it
already removes some unnecessary code from the repository.

We used this library before, but it was heavy because it included
all the localizations in it. Now, the package is much smaller (5.0M).

Signed-off-by: Henrique Moody <henriquemoody@gmail.com>
2024-02-13 21:53:46 +01:00
The Respect Panda
703f610ee8 Update Regional Information 2023-06-13 21:46:15 -03:00
Alexandre Gomes Gaigalas
7c28d2c1f4 Update sorting order on public suffix data 2023-02-19 00:44:41 -03:00
Alexandre Gomes Gaigalas
e2b6138bf6 Add PublicDomainSuffix Rule
- List will be auto-updated from https://publicsuffix.org/list/public_suffix_list.dat
 - Updated AbstractSearcher rules to be case insensitive
 - Updated PR creator bots
 - Docs and tests
2023-02-19 00:19:10 -03:00
Alexandre Gomes Gaigalas
6173757f63 Use PHP files and setup a runtime cache for CountryInfo
Previously, we were loading country info from a JSON file. This
changes it to use PHP files instead. It also caches these resources
across calls, so that these files are loaded at most once
per process.
2023-02-19 00:19:10 -03:00
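
The per-process cache described in the commit above can be sketched in Python as below. This is an illustration under assumptions, not the project's code: the class name `DataCache` and the injected loader function are hypothetical, and the loader stands in for reading a PHP data file.

```python
class DataCache:
    """Load each named resource at most once per process (illustrative)."""

    def __init__(self, loader):
        self._loader = loader   # function that actually reads the resource
        self._loaded = {}       # resource name -> loaded data

    def get(self, name):
        # Only hit the loader the first time a given name is requested.
        if name not in self._loaded:
            self._loaded[name] = self._loader(name)
        return self._loaded[name]


load_calls = []


def fake_loader(name):
    # Stand-in for reading a data file from disk; records each real load.
    load_calls.append(name)
    return {"code": name}


cache = DataCache(fake_loader)
first = cache.get("BR")
second = cache.get("BR")
```

Repeated validations against the same country therefore reuse the in-memory data rather than reading the file again.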
Alexandre Gomes Gaigalas
74dee73f65 Update updater workflows, remove countries outside ISO 3166-2, cleanup 2023-02-15 00:29:10 -03:00
Alexandre Gomes Gaigalas
1e2f75287c Change update_subdivision_codes to work with salsa/iso-codes, updates data 2023-02-15 00:01:32 -03:00
Henrique Moody
4c21a7ffc9
Revert "Use "sokil/php-isocodes" on SubdivisionCode"
This reverts commit 9c9c76ebfb.
2021-03-19 15:12:45 +01:00
Henrique Moody
9c9c76ebfb
Use "sokil/php-isocodes" on SubdivisionCode
Inside the "data/" directory, we have files with lists of subdivisions
that need to be updated. We have to update them manually, or we automate
that task with a script and GitHub actions.

The two options are very time consuming and also not ideal. We don't
want to deal with that problem and, thinking that the user of this
library may want to show the data that we validate, we should create a
whole library to make it more usable.

The "sokil/php-isocodes" is a simple library that even supports
translations. It's frequently updated and has gone through major
performance updates.

I am not fond of the idea of requiring an external library to install
Validation, as I have seen that go wrong before [1]. Ideally, that
would be an optional dependency for people who would like to use those
rules, but to make that happen, we need to release a MAJOR version.

[1]: d072b4de6a

Signed-off-by: Henrique Moody <henriquemoody@gmail.com>
2021-02-06 15:09:04 +01:00
Henrique Moody
718bacad04
Remove subdivision code rules per country
There is not much benefit from having individual rules for each country's
subdivision, quite the opposite. It increases the amount of code and
makes it hard to change the implementation of these rules. Right now,
the only sane way to change those rules is with a customized script.

This commit will remove the Subdivision Code rules per country and
instead will put that information into JSON files.

Neither of us wants to keep this in this library anymore, and we are
considering having another library to deal with this data [1], but since
that may take some time, it looks better to do it temporarily
here.

[1]: https://github.com/sokil/php-isocodes/issues/12

Co-authored-by: Mazen Touati <mazen_touati@hotmail.com>
Signed-off-by: Henrique Moody <henriquemoody@gmail.com>
2019-04-06 23:05:24 +02:00