SSWebTechIO

NumPy Fundamentals (Array Basics)

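NumPy is a library for fast numerical computing.
Main object: ndarray (n-dimensional array), which holds elements of a single data type.
Arrays are faster than Python lists for math operations because the work runs in compiled, vectorized loops.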

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([[1, 2], [3, 4]])

print(a)
print(b)
print(a.shape)     # (4,)
print(b.shape)     # (2, 2)
print(a.dtype)
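
The speed advantage comes from vectorization: one expression acts on the whole array instead of looping element by element in Python. A minimal sketch of the difference follows; the names values, arr and grid are illustrative and not part of the original notes.

```python
import numpy as np

values = [1, 2, 3, 4]

# With a list, each element is processed in a Python-level loop.
doubled_list = [v * 2 for v in values]

# With an array, the same operation is a single vectorized expression.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_arr)    # [2 4 6 8]

# ndim and size complement shape: number of axes and total element count.
grid = np.array([[1, 2], [3, 4]])
print(grid.ndim)      # 2
print(grid.size)      # 4
```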

NumPy Array Creation (zeros, ones, arange, linspace)

zeros(): array filled with 0
ones(): array filled with 1
arange(): like Python's range(), but returns an array (stop value excluded)
linspace(): a fixed number of evenly spaced values (endpoint included by default)

import numpy as np

z = np.zeros((2, 3))
o = np.ones((2, 3))
r = np.arange(0, 10, 2)
l = np.linspace(0, 1, 5)

print(z)
print(o)
print(r)
print(l)
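
Two details are easy to miss here: arange() leaves out the stop value while linspace() includes it, and zeros()/ones() produce floats unless a dtype is given. The sketch below makes both visible; the variable names steps, points and counts are illustrative.

```python
import numpy as np

steps = np.arange(0, 10, 2)     # start, stop, step: the stop value 10 is excluded
points = np.linspace(0, 1, 5)   # start, stop, count: the endpoint 1 is included

print(steps.tolist())    # [0, 2, 4, 6, 8]
print(points.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]

# zeros() and ones() default to float64; pass dtype to change that.
print(np.zeros((2, 3)).dtype)   # float64
counts = np.zeros((2, 3), dtype=int)
print(counts.tolist())          # [[0, 0, 0], [0, 0, 0]]
```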

NumPy Mathematical Operations (Element-wise + Broadcasting)

Operations are element-wise: +, -, *, /, **
Broadcasting allows operations between arrays of different but compatible shapes, and between an array and a scalar.

import numpy as np

a = np.array([1, 2, 3])
b = np.array([10, 20, 30])

print(a + b)
print(a * b)
print(a ** 2)

# Broadcasting
c = np.array([1, 2, 3])
print(c + 5)        # adds 5 to each element
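
Broadcasting also works between two arrays, not only between an array and a scalar. The sketch below adds a row and a column to a 2x3 array; the names m, row and col are illustrative.

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)
col = np.array([[100], [200]])   # shape (2, 1)

# The row is reused for every row of m.
print(m + row)
# [[11 22 33]
#  [14 25 36]]

# The column is reused for every column of m.
print(m + col)
# [[101 102 103]
#  [204 205 206]]
```

Shapes are compared from the last axis backwards, and each pair of sizes must be equal or one of them must be 1. Adding a shape (2,) array to m would raise a ValueError because 2 does not match 3.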

NumPy Aggregation (sum, mean, max, min)

Aggregation functions: sum(), mean(), max(), min(), std()
Use axis=0 to aggregate down the rows (one result per column) and axis=1 to aggregate across the columns (one result per row).

import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.sum())
print(m.mean())
print(m.max())

print(m.sum(axis=0))   # column sums
print(m.sum(axis=1))   # row sums
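
The axis argument works the same way for the other aggregation functions. A short sketch on the same 2x3 array, showing that the axis named is the one that disappears from the result:

```python
import numpy as np

m = np.array([[1, 2, 3],
              [4, 5, 6]])

print(m.mean(axis=0))        # [2.5 3.5 4.5]  one mean per column
print(m.max(axis=1))         # [3 6]          one max per row
print(m.min())               # 1              no axis: whole array

# The aggregated axis is removed from the shape.
print(m.sum(axis=0).shape)   # (3,)
print(m.sum(axis=1).shape)   # (2,)
```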

Pandas Series (Basics)

Series is a 1D labeled array.
Index labels can be automatic (0, 1, 2, ...) or custom.

import pandas as pd

s1 = pd.Series([10, 20, 30])
s2 = pd.Series([80, 90, 85], index=["A", "B", "C"])

print(s1)
print(s2)
print(s2["B"])
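
With custom labels, a value can be reached by label or by position, and a Series supports the same element-wise operations as a NumPy array. A minimal sketch; the names s, high and bonus are illustrative.

```python
import pandas as pd

s = pd.Series([80, 90, 85], index=["A", "B", "C"])

print(s.loc["B"])    # 90, lookup by label
print(s.iloc[0])     # 80, lookup by position

# A boolean condition keeps only the matching entries and their labels.
high = s[s > 82]
print(high.index.tolist())   # ['B', 'C']

# Arithmetic is element-wise and the labels are preserved.
bonus = s + 5
print(bonus.tolist())        # [85, 95, 90]
```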

Pandas DataFrame (Basics)

DataFrame is a 2D table (rows and columns).
It can be created from a dictionary, a list of dictionaries, or a CSV file.

import pandas as pd

data = {
    "Name": ["Sourav", "Amit", "Rita"],
    "Age": [25, 22, 23],
    "Marks": [80, 90, 85]
}

df = pd.DataFrame(data)
print(df)
print(df["Name"])
print(df.head())
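
The same table can be built from a list of dictionaries, one dictionary per row. The sketch below shows that route; reading a CSV is left as a comment because the file name students.csv is hypothetical.

```python
import pandas as pd

# One dictionary per row; the keys become the column names.
rows = [
    {"Name": "Sourav", "Age": 25, "Marks": 80},
    {"Name": "Amit", "Age": 22, "Marks": 90},
    {"Name": "Rita", "Age": 23, "Marks": 85},
]

df = pd.DataFrame(rows)

print(df.shape)                # (3, 3)  rows, columns
print(df[["Name", "Marks"]])   # select several columns with a list
print(df.loc[1, "Name"])       # Amit  (row label 1, column "Name")

# From a CSV file (hypothetical file name):
# df = pd.read_csv("students.csv")
```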

Data Cleaning (Missing Values + Drop + Fill)

Check missing: isna() (isnull() is an alias)
Remove missing: dropna()
Fill missing: fillna() (with the mean, 0, or any other value)

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [80, np.nan, 90]
})

print(df.isna())

df2 = df.fillna(0)
print(df2)

df3 = df.dropna()
print(df3)
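
Filling with the column mean is a common alternative to filling with 0, since it leaves the column average unchanged. A minimal sketch on the same data; the names mean_marks and filled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [80, np.nan, 90]
})

# Count the missing values in the column.
print(df["Marks"].isna().sum())   # 1

# mean() skips NaN by default: (80 + 90) / 2 = 85.0
mean_marks = df["Marks"].mean()

# assign() returns a new DataFrame, so df itself is not modified.
filled = df.assign(Marks=df["Marks"].fillna(mean_marks))
print(filled["Marks"].tolist())   # [80.0, 85.0, 90.0]
```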

Data Cleaning (Duplicates + Rename + Type Conversion)

Remove duplicates: drop_duplicates()
Rename columns: rename()
Convert types: astype()

import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "A", "B"],
    "Marks": ["80", "80", "90"]
})

df = df.drop_duplicates()
df = df.rename(columns={"Marks": "Score"})
df["Score"] = df["Score"].astype(int)

print(df)
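
astype(int) raises a ValueError when a string is not a valid number. For columns that may contain such entries, pd.to_numeric() with errors="coerce" is one option, named here as an alternative to the astype() call above rather than as part of the original notes. The sample values are illustrative.

```python
import pandas as pd

raw = pd.Series(["80", "n/a", "90"])

# Unparseable strings become NaN instead of raising an error.
scores = pd.to_numeric(raw, errors="coerce")
print(scores.isna().tolist())    # [False, True, False]

# After dropping the NaN entry, the rest converts cleanly to int.
clean = scores.dropna().astype(int)
print(clean.tolist())            # [80, 90]
```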

Aggregation and GroupBy (sum, mean, count)

groupby() splits the rows by category so each group can be aggregated separately.
Common aggregates: sum(), mean(), count(), min(), max().

import pandas as pd

df = pd.DataFrame({
    "Dept": ["CSE", "CSE", "IT", "IT"],
    "Marks": [80, 90, 70, 85]
})

print(df.groupby("Dept")["Marks"].mean())
print(df.groupby("Dept")["Marks"].sum())
print(df.groupby("Dept")["Marks"].count())
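
Several aggregates can be computed in one pass with agg(), which returns one row per group and one column per function. A minimal sketch on the same data; the name summary is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Dept": ["CSE", "CSE", "IT", "IT"],
    "Marks": [80, 90, 70, 85]
})

# One row per department, one column per aggregate.
summary = df.groupby("Dept")["Marks"].agg(["sum", "mean", "count"])
print(summary)

print(summary.loc["CSE", "mean"])   # 85.0
print(summary.loc["IT", "sum"])     # 155
```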

Matplotlib Visualization (Line and Bar)

Matplotlib is used for plotting graphs in Python.
Common plots: line, bar, scatter, histogram.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 15, 25]

plt.plot(x, y)
plt.title("Line Plot")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()

names = ["A", "B", "C"]
marks = [80, 90, 85]

plt.bar(names, marks)
plt.title("Bar Chart")
plt.xlabel("Name")
plt.ylabel("Marks")
plt.show()
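
The two plots can also share one figure and be written to a file instead of shown in a window. The sketch below assumes a non-interactive environment, so it selects the Agg backend; the output file name line_and_bar.png is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: no display available, render to file only
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 15, 25]
names = ["A", "B", "C"]
marks = [80, 90, 85]

# One figure with two side-by-side axes.
fig, (ax_line, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

ax_line.plot(x, y)
ax_line.set_title("Line Plot")
ax_line.set_xlabel("X")
ax_line.set_ylabel("Y")

ax_bar.bar(names, marks)
ax_bar.set_title("Bar Chart")
ax_bar.set_xlabel("Name")
ax_bar.set_ylabel("Marks")

fig.tight_layout()                  # keep labels from overlapping
fig.savefig("line_and_bar.png")     # hypothetical output file name
```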

Matplotlib Visualization (Scatter and Histogram)

A scatter plot shows the relationship between two variables.
A histogram shows the frequency distribution of one variable.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.scatter(x, y)
plt.title("Scatter Plot")
plt.xlabel("X")
plt.ylabel("Y")
plt.show()

data = [10, 12, 12, 13, 15, 15, 15, 18, 20]

plt.hist(data, bins=5)
plt.title("Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()
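
The bar heights in a histogram are counts per bin. np.histogram() computes those counts without drawing anything, which makes the binning of the data above easy to inspect. A minimal sketch; the names counts and edges are illustrative.

```python
import numpy as np

data = [10, 12, 12, 13, 15, 15, 15, 18, 20]

# bins=5 splits the range 10..20 into five equal-width bins.
counts, edges = np.histogram(data, bins=5)

print(edges.tolist())    # [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
print(counts.tolist())   # [1, 3, 3, 0, 2]
```

Each bin includes its left edge and excludes its right edge, except the last bin, which includes both. That is why 12 is counted in the second bin and 20 in the last one.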
