---
tags:
aliases:
date: 2024-11-10
time: 16:55:41
description:
---

**Can be used in place of [Beautiful Soup Documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)**

## **Why [Beautiful Soup Documentation](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) is Overrated:**

**Speed:** It is slow on very large documents.

**Thread blocking:** Much like `Requests`, it is not designed with async in mind, so parsing blocks the calling thread. That makes it a poor fit for async, high-concurrency scraping.

## **What You Should Use Instead:** `selectolax`

`selectolax` is a lesser-known library that wraps the Modest and Lexbor HTML engines (both written in C), which gives it better performance and lower memory consumption.

```python
from selectolax.parser import HTMLParser

html_content = "<p>Test</p>"
tree = HTMLParser(html_content)

# css() returns a list of matching nodes; take the first <p> and read its text
text = tree.css("p")[0].text()
print(text)  # Output: Test
```

With `selectolax`, you keep the core HTML parsing and CSS-selector capabilities but gain a large speedup, which makes it well suited to data-intensive web scraping tasks.

> **“Do not fall in love with the tool; rather, fall in love with the outcome.” Choosing the proper tool is half the battle.**

# References

- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)