Metadata-Version: 2.1
Name: tldextract
Version: 3.4.4
Summary: Accurately separates a URL's subdomain, domain, and public suffix, using the Public Suffix List (PSL). By default, this includes the public ICANN TLDs and their exceptions. You can optionally support the Public Suffix List's private domains as well.
Home-page: https://github.com/john-kurkowski/tldextract
Author: John Kurkowski
Author-email: john.kurkowski@gmail.com
License: BSD-3-Clause
Keywords: tld domain subdomain url parse extract urlparse urlsplit public suffix list publicsuffix publicsuffixlist
Classifier: Development Status :: 5 - Production/Stable
Classifier: Topic :: Utilities
Classifier: License :: OSI Approved :: BSD License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Python: >=3.7
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: idna
Requires-Dist: requests (>=2.1.0)
Requires-Dist: requests-file (>=1.4)
Requires-Dist: filelock (>=3.0.8)

`tldextract` accurately separates a URL's subdomain, domain, and public suffix.

It does this via the Public Suffix List (PSL).

    >>> import tldextract
    >>> tldextract.extract('http://forums.news.cnn.com/')
    ExtractResult(subdomain='forums.news', domain='cnn', suffix='com')
    >>> tldextract.extract('http://forums.bbc.co.uk/') # United Kingdom
    ExtractResult(subdomain='forums', domain='bbc', suffix='co.uk')
    >>> tldextract.extract('http://www.worldbank.org.kg/') # Kyrgyzstan
    ExtractResult(subdomain='www', domain='worldbank', suffix='org.kg')

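The lookup behind these results can be pictured as a longest-suffix match against a list of known public suffixes. The following is a minimal sketch of that idea only, not tldextract's implementation: `KNOWN_SUFFIXES` is a tiny hand-picked stand-in for the PSL, and `split_host` is a hypothetical helper name. It ignores wildcard and exception rules, IP addresses, and internationalized names.

```python
from urllib.parse import urlsplit

# Tiny illustrative stand-in for the Public Suffix List (assumption: the real
# list has thousands of entries plus wildcard and exception rules).
KNOWN_SUFFIXES = {"com", "uk", "co.uk", "kg", "org.kg"}


def split_host(url):
    """Return (subdomain, domain, suffix) using the longest matching suffix."""
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    # Try candidates from longest to shortest, so "co.uk" wins over "uk".
    # Start at 1 so at least one label is left over for the domain.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in KNOWN_SUFFIXES:
            return ".".join(labels[: i - 1]), labels[i - 1], candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""
```

With this sample set, `split_host('http://forums.bbc.co.uk/')` returns `('forums', 'bbc', 'co.uk')`, because `co.uk` is tried before the shorter `uk`.
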
`ExtractResult` is a namedtuple, so it's simple to access the parts you want.

    >>> ext = tldextract.extract('http://forums.bbc.co.uk')
    >>> (ext.subdomain, ext.domain, ext.suffix)
    ('forums', 'bbc', 'co.uk')
    >>> # rejoin subdomain and domain
    >>> '.'.join(ext[:2])
    'forums.bbc'
    >>> # a common alias
    >>> ext.registered_domain
    'bbc.co.uk'

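Because the result is a tuple, it also supports unpacking and iteration. A small stand-in built with `typing.NamedTuple` shows how a derived property such as `registered_domain` can sit alongside the three fields. `Parts` is a hypothetical class for illustration, not the library's `ExtractResult`, and returning an empty string when a part is missing is an assumption of this sketch.

```python
from typing import NamedTuple


class Parts(NamedTuple):
    """Hypothetical stand-in for a three-field extraction result."""

    subdomain: str
    domain: str
    suffix: str

    @property
    def registered_domain(self) -> str:
        # Join domain and suffix only when both are present.
        if self.domain and self.suffix:
            return f"{self.domain}.{self.suffix}"
        return ""


parts = Parts("forums", "bbc", "co.uk")
subdomain, domain, suffix = parts  # tuple unpacking works as usual
fqdn = ".".join(p for p in parts if p)  # skip empty parts when rejoining
```

Filtering out empty parts before joining avoids a leading or trailing dot when the subdomain or suffix is missing.
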
By default, this package supports the public ICANN TLDs and their exceptions.
You can optionally support the Public Suffix List's private domains as well.
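
The difference between the two modes can be sketched with two small suffix sets. This is an illustration under stated assumptions, not the package's code: `ICANN_SUFFIXES`, `PRIVATE_SUFFIXES`, and `split_with_private` are hypothetical names, and the sets are tiny samples in which `blogspot.com` plays the role of a private suffix. In tldextract itself, the corresponding switch is the `include_psl_private_domains` option.

```python
# Sample data only (assumption): the real PSL separates ICANN and private
# sections and is far larger.
ICANN_SUFFIXES = {"com", "uk", "co.uk"}
PRIVATE_SUFFIXES = {"blogspot.com"}


def split_with_private(host, include_private=False):
    """Return (subdomain, domain, suffix), optionally honoring private suffixes."""
    suffixes = (ICANN_SUFFIXES | PRIVATE_SUFFIXES) if include_private else ICANN_SUFFIXES
    labels = host.lower().split(".")
    # Longest candidate first, always leaving one label for the domain.
    for i in range(1, len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return ".".join(labels[: i - 1]), labels[i - 1], candidate
    # No known suffix: treat the last label as the domain.
    return ".".join(labels[:-1]), labels[-1], ""
```

In this sketch, `split_with_private("waiter.blogspot.com")` returns `('waiter', 'blogspot', 'com')`, while passing `include_private=True` returns `('', 'waiter', 'blogspot.com')`: the private suffix shifts which label counts as the domain.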