vault backup: 2024-11-10 17:02:50

2024-11-10 17:02:50 +08:00
parent 5d7081dc35
commit cbd13ff74b
5 changed files with 184 additions and 0 deletions

@@ -0,0 +1,34 @@
---
tags:
aliases:
date: 2024-11-10
time: 16:58:43
description:
---
**Can be used in place of [Matplotlib](https://matplotlib.org/)**
Yes, `Matplotlib` is a classic: it's virtually the default choice for visualizing data in Python. But to be frank, using it can feel like performing delicate brain surgery with an axe, and its syntax is a little verbose, if we're being honest. If you're not creating highly customized visualizations, there are better options with a more straightforward syntax.
## Why [Matplotlib](https://matplotlib.org/) is Overrated:
**Clunky syntax:** Even simple charts can take a surprisingly large number of lines to plot.
**Outdated default style:** The default style is configurable, but it isn't exactly inspiring or, for that matter, particularly readable.
## What You Should Replace It With: Plotly
When clean visuals and interactivity matter and you definitely don't want a pile of code, `Plotly` is a great choice. It is especially useful when you have to share visuals quickly or embed them in web presentations.
```python
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length", color="species")
fig.show()
```
With `Plotly`, you immediately get interactive graphs with good default visuals. The code is more concise and, by default, the chart includes features such as hover tooltips and zooming.
# References
- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)

@@ -0,0 +1,34 @@
---
tags:
aliases:
date: 2024-11-10
time: 16:57:31
description:
---
**Can be used in place of [pandas](https://pandas.pydata.org/)**
Now, listen up: `Pandas` is great for data exploration and for medium-sized datasets. But people use it for everything, as if it were some magic solution to every data problem, and quite frankly, it isn't. Working with `Pandas` on huge datasets can turn your machine into a sputtering fan engine, and the memory overhead just doesn't make sense for some workflows.
## **Why [pandas](https://pandas.pydata.org/) Is Overrated:**
**Memory Usage:** `Pandas` operates mainly in memory, so operations on a dataset that approaches or exceeds available RAM slow down sharply or fail outright.
**Limited Scalability:** Scaling with `Pandas` isn't easy. It was never designed for big data.
## What You Should Use Instead: Polars
`Polars` is an ultra-fast DataFrame library written in Rust and built on the Apache Arrow memory format. It is optimized for memory efficiency and multithreaded performance, which makes it a good fit when you want to crunch data quickly without making your machine struggle.
```python
import polars as pl
df = pl.read_csv("big_data.csv")
filtered_df = df.filter(pl.col("value") > 50)
print(filtered_df)
```
**Why `Polars`?** It can process data that would bring `Pandas` to its knees, often in a fraction of the time. It also supports lazy evaluation, meaning it computes only what's needed.
# References
- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)

@@ -0,0 +1,45 @@
---
tags:
aliases:
date: 2024-11-10
time: 17:00:12
description:
---
**Can be used in place of [scikit-learn](https://scikit-learn.org/stable/)**
I know, `Scikit-Learn` isn't supposed to be a deep learning library, but people use it as if it were. It is incredibly handy for quick prototyping and traditional machine learning models, but when it comes to neural networks, it's just not in the same league as a library designed with tensors in mind.
## Why [scikit-learn](https://scikit-learn.org/stable/) is Overrated:
**No GPU Support:** Training on GPUs can speed up deep learning dramatically, but `Scikit-Learn` is built around CPU computation and does not support GPU training of neural networks.
**Not Optimized for Neural Networks:** `Scikit-Learn` wasn't designed for deep learning; using it that way practically guarantees poor results.
## What You Should Use Instead: PyTorch
`PyTorch` is more general and supports GPUs, which makes it a strong fit for deep learning projects. It's also Pythonic, so for someone coming from `Scikit-Learn` it will feel natural, but with much more power.
```python
import torch
import torch.nn as nn
import torch.optim as optim

# Define a simple model
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 2),
)

# Define optimizer and loss
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
```
If you're serious about deep learning, you'll want a library built for the task, which will save you from these limitations and inefficiencies. With `PyTorch` you can fine-tune models and leverage GPUs to your heart's content.
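As a rough illustration of what using the GPU looks like in practice, the sketch below runs a single training step on whichever device is available. The batch is random data invented for the example (32 samples, 10 features, 2 classes), so the loss value itself is meaningless.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Use the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2)).to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Hypothetical random batch: 32 samples, 10 features, 2 classes.
inputs = torch.randn(32, 10, device=device)
targets = torch.randint(0, 2, (32,), device=device)

# One training step: forward pass, loss, backward pass, parameter update.
optimizer.zero_grad()
logits = model(inputs)
loss = loss_fn(logits, targets)
loss.backward()
optimizer.step()

print(f"device={device}, loss={loss.item():.4f}")
```

The model and the tensors must live on the same device, which is why both are created with or moved to `device`. A real training loop would repeat this step over batches from a dataset.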
# References
- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)

@@ -0,0 +1,36 @@
---
tags:
aliases:
date: 2024-11-10
time: 16:54:12
description:
---
**Can be used in place of [requests](https://pypi.org/project/requests/)**
## **Why [requests](https://pypi.org/project/requests/) is Overrated:**
**Blocking IO:** `Requests` is synchronous, which means each call blocks until its response arrives, so calls run one after another. This is less than ideal for I/O-bound programs.
**Heavy:** It has loads of convenience baked in, but that comes at a cost in speed and memory footprint. This is not a big deal in a simple script, but in larger systems it can become a resource hog.
## **What You Should Use Instead:** `httpx`
For handling many requests concurrently, `httpx` provides a similar API with asynchronous support. If you make many API calls, it can save you time and resources because the requests are in flight at the same time instead of queuing behind each other.
```python
import asyncio

import httpx


async def fetch_data(url):
    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.json()


# Simple and non-blocking: run the coroutine on an event loop to get the result
data = asyncio.run(fetch_data("https://api.example.com/data"))
```
> **Pro Tip:** Asynchronous requests can cut total processing time substantially when the task is web scraping or ingesting data from remote sources.
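The time saving comes from running many requests at once, not from a single `await`. The sketch below shows that pattern with `asyncio.gather`. To stay runnable without network access it uses a hypothetical `fake_fetch` helper that sleeps instead of calling a server; the URLs and the 0.2 second delay are likewise made up.

```python
import asyncio
import time


async def fake_fetch(url: str, delay: float = 0.2) -> dict:
    """Stand-in for an HTTP call; sleeps instead of touching the network."""
    await asyncio.sleep(delay)
    return {"url": url, "status": "ok"}


async def fetch_all(urls: list[str]) -> list[dict]:
    # gather() schedules all coroutines at once and returns results
    # in the same order as the inputs.
    return await asyncio.gather(*(fake_fetch(url) for url in urls))


urls = [f"https://api.example.com/data/{i}" for i in range(5)]

start = time.perf_counter()
results = asyncio.run(fetch_all(urls))
elapsed = time.perf_counter() - start

print(f"{len(results)} responses in {elapsed:.2f}s")
```

Five sequential calls would take about one second here, while the concurrent version finishes in roughly the time of a single call. With `httpx`, the body of `fake_fetch` would instead await `client.get(url)` on a shared `httpx.AsyncClient`.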
# References
- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)

@@ -0,0 +1,35 @@
---
tags:
aliases:
date: 2024-11-10
time: 16:55:41
description:
---
**Can be used in place of [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)**
## **Why [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) is Overrated:**
**Speed:** It slows down noticeably on very large documents.
**Thread blocking:** Much like `Requests`, it was not designed with async in mind, which makes it ill-suited for scraping dynamic websites.
## **What You Should Use Instead:** `selectolax`
`selectolax` is a lesser-known library built on the Modest and Lexbor HTML engines, which gives it better performance and lower memory consumption.
```python
from selectolax.parser import HTMLParser
html_content = "<html><body><p>Test</p></body></html>"
tree = HTMLParser(html_content)
text = tree.css("p")[0].text()
print(text) # Output: Test
```
With `selectolax`, you keep the core HTML parsing and CSS selector capabilities but gain a large speed improvement, which makes it well suited to data-intensive web scraping tasks.
> **“Do not fall in love with the tool; rather, fall in love with the outcome.” Choosing the proper tool is half the battle.**
# References
- [5 Overrated Python Libraries (And What You Should Use Instead) | by Abdur Rahman | Nov, 2024 | Python in Plain English](https://python.plainenglish.io/5-overrated-python-libraries-and-what-you-should-use-instead-106bd9ded180)