Keep the www\d* domain for some stories

When extracting a story's domain, keep the leading `www\d*` label in cases
where it's the only label before the top-level domain (e.g. www31337.tld).
Basilio the cat 2015-08-09 22:21:22 +02:00
parent ed021d4975
commit 3743a41868


@@ -258,10 +258,11 @@ class Story < ActiveRecord::Base
     else
       # URI.parse is not very lenient, so we can't use it
       self.url.
-        gsub(/^[^:]+:\/\//, "").   # proto
-        gsub(/\/.*/, "").          # path
-        gsub(/:\d+$/, "").         # possible port
-        gsub(/^www\d*\./, "")      # possible "www3." in host
+        gsub(/^[^:]+:\/\//, "").          # proto
+        gsub(/\/.*/, "").                 # path
+        gsub(/:\d+$/, "").                # possible port
+        gsub(/^www\d*\.(.+\..+)/, '\1')   # possible "www3." in host unless
+                                          # it's the only non-TLD
     end
   end
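
A minimal sketch of how the new substitution chain behaves, pulled out of the model so it can be run on its own. The helper name `story_domain` and the sample URLs are assumptions for illustration only; they are not part of this commit.

```ruby
# Hypothetical standalone helper mirroring the gsub chain in the diff above.
# `story_domain` is an assumed name; the commit itself changes a method on Story.
def story_domain(url)
  return nil if url.nil? || url.strip.empty?

  url.
    gsub(/^[^:]+:\/\//, "").          # proto
    gsub(/\/.*/, "").                 # path
    gsub(/:\d+$/, "").                # possible port
    gsub(/^www\d*\.(.+\..+)/, '\1')   # strip "www3." only if two or more labels remain
end

story_domain("http://www3.example.com:8080/a/b")  # => "example.com"
story_domain("https://www31337.tld/story")        # => "www31337.tld"
story_domain("https://www.com/")                  # => "www.com"
```

The capture group `(.+\..+)` requires at least one dot in what follows the `www\d*.` prefix, so the prefix is dropped only when a host with two or more labels is left over. With a single remaining label the pattern does not match and the host is returned unchanged.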