Keep the www\d* domain for some stories
When extracting a story's domain, keep the leading `www\d*` label in cases where it's the only label besides the top-level domain (e.g. www31337.tld).
parent ed021d4975
commit 3743a41868
@@ -258,10 +258,11 @@ class Story < ActiveRecord::Base
       else
         # URI.parse is not very lenient, so we can't use it
         self.url.
-          gsub(/^[^:]+:\/\//, "").  # proto
-          gsub(/\/.*/, "").  # path
-          gsub(/:\d+$/, "").  # possible port
-          gsub(/^www\d*\./, "")  # possible "www3." in host
+          gsub(/^[^:]+:\/\//, "").         # proto
+          gsub(/\/.*/, "").                # path
+          gsub(/:\d+$/, "").               # possible port
+          gsub(/^www\d*\.(.+\..+)/, '\1')  # possible "www3." in host unless
+                                           # it's the only non-TLD
       end
     end
 
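To illustrate the behavior change, here is a minimal standalone sketch of the old and new gsub chains. The method names `old_story_domain` and `new_story_domain` are hypothetical and exist only for this illustration; in the commit the chain is applied to `self.url` inside the `Story` model.

```ruby
# Hypothetical helpers for illustration only; not part of the commit.

# Old chain: always strips a leading "www", "www3.", etc.
def old_story_domain(url)
  url.
    gsub(/^[^:]+:\/\//, "").  # proto
    gsub(/\/.*/, "").         # path
    gsub(/:\d+$/, "").        # possible port
    gsub(/^www\d*\./, "")     # possible "www3." in host
end

# New chain: strips the leading "www\d*." only when what remains still
# contains a dot, i.e. at least one more label in front of the TLD.
def new_story_domain(url)
  url.
    gsub(/^[^:]+:\/\//, "").         # proto
    gsub(/\/.*/, "").                # path
    gsub(/:\d+$/, "").               # possible port
    gsub(/^www\d*\.(.+\..+)/, '\1')  # keep "www3." if it's the only non-TLD
end

puts old_story_domain("http://www31337.tld/story")        # tld
puts new_story_domain("http://www31337.tld/story")        # www31337.tld
puts new_story_domain("https://www3.example.org:8080/x")  # example.org
```

For ordinary hosts such as www.example.com both chains give the same result; they differ only when the `www\d*` label is the sole label in front of the TLD.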