Exercise Set II - NumPy
Exercise 1
We start off by practicing array creation and some basic NumPy operations.
Part A: Basics
Create the following one-dimensional NumPy array of integers:
array_1 = np.array([23, 12, 23, 6, 32, 78, 3, 8, 4, 223, 12, 56, 78, 324])
Write some Python code to answer the following questions:
Get the \(4\) largest values of array_1.
Compute array_1 to the power of \(2\) using \(3\) different methods.
Given array_1, negate all elements which are between \(23\) and \(84\), in place.
Find indices of elements equal to \(12\) from array_1.
Convert the one-dimensional array array_1 to a two-dimensional array with \(7\) rows.
- Use np.sort() with array slicing to find the largest values.
- For powers, try the element-wise operator (**), a NumPy function (np.power()), and a custom loop.
- Boolean indexing with & combines conditions for filtering.
- Use np.where() or np.nonzero() to find indices.
- The .reshape() method changes array dimensions.
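The hints above can be sketched roughly as follows. This is one possible approach, not a reference solution: the variable names are illustrative, the range 23 to 84 is assumed to be inclusive at both ends, and the negation is applied to a copy so that the other steps still see the original values.

```python
import numpy as np

array_1 = np.array([23, 12, 23, 6, 32, 78, 3, 8, 4, 223, 12, 56, 78, 324])

# Four largest values: sort ascending and take the last four entries
largest_four = np.sort(array_1)[-4:]

# Squares computed three ways: operator, NumPy function, explicit loop
squares_op = array_1 ** 2
squares_func = np.power(array_1, 2)
squares_loop = np.array([x * x for x in array_1])

# Negate elements between 23 and 84 (assumed inclusive), in place on a copy
negated = array_1.copy()
mask = (negated >= 23) & (negated <= 84)
negated[mask] *= -1

# Indices of elements equal to 12 (np.where returns a tuple of index arrays)
idx_12 = np.where(array_1 == 12)[0]

# Reshape to 7 rows; -1 lets NumPy infer the number of columns
as_2d = array_1.reshape(7, -1)

print(largest_four)
print(idx_12)
print(as_2d.shape)
```

If the exercise intends an exclusive range, replacing `>=` and `<=` with `>` and `<` changes which of the boundary values (here the two 23s) are negated.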
Part B: Claims Example
Now, let’s create NumPy arrays to represent actuarial data and perform basic operations.
Create a NumPy array representing annual claim amounts for a portfolio:
[12500, 8900, 15600, 22000, 5400, 18900, 31200]
Create a 2D array representing loss ratios for 4 policies over 3 years:
Policy 1: [0.68, 0.72, 0.65] Policy 2: [0.89, 0.91, 0.85] Policy 3: [0.45, 0.52, 0.48] Policy 4: [0.78, 0.83, 0.76]
Use NumPy functions to create:
- An array of 1000 zeros (for initializing claim counts)
- An array of 500 ones multiplied by 1200 (base premiums)
- An array of integers from 18 to 65 (insurable ages)
- 10 evenly spaced discount factors between 1.0 and 0.5
- Use np.array() to convert Python lists to NumPy arrays.
- For the 2D array, create a list of lists where each inner list represents one policy.
- For creating special arrays, look at: np.zeros(), np.ones(), np.arange(), np.linspace().
- Remember that np.arange(start, stop) doesn’t include the stop value.
- You can multiply arrays by scalars: array * 1200
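A minimal sketch of the array creation steps, assuming the variable names below (they are illustrative and not prescribed by the exercise):

```python
import numpy as np

# Annual claim amounts for the portfolio
claims = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])

# Loss ratios: one row per policy, one column per year
loss_ratios = np.array([
    [0.68, 0.72, 0.65],  # Policy 1
    [0.89, 0.91, 0.85],  # Policy 2
    [0.45, 0.52, 0.48],  # Policy 3
    [0.78, 0.83, 0.76],  # Policy 4
])

claim_counts = np.zeros(1000)                 # 1000 zeros
base_premiums = np.ones(500) * 1200           # 500 entries, each 1200.0
insurable_ages = np.arange(18, 66)            # stop is exclusive, so 66 yields 18..65
discount_factors = np.linspace(1.0, 0.5, 10)  # 10 values, both endpoints included

print(loss_ratios.shape)
print(insurable_ages[0], insurable_ages[-1])
print(discount_factors)
```

Note the contrast between the two range helpers: `np.arange` excludes its stop value, while `np.linspace` includes it by default.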
Part C: Array Attributes and Info
Using the arrays created above:
Print the shape, number of dimensions, and total number of elements for the loss ratios array
Calculate the memory usage of the claim amounts array
Check and explain the data types of all arrays
- Array attributes you need: .shape, .ndim, .size, .dtype
- For memory usage: .itemsize (bytes per element), .nbytes (total bytes)
- Data types depend on the input: integers typically become int64 (the default integer type is platform-dependent, and some setups, such as older NumPy versions on Windows, use int32), decimals become float64
- Arrays created with np.zeros() default to float64; np.ones() also defaults to float64
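As a rough illustration of these attributes, the sketch below recreates two of the earlier arrays (under assumed names) and inspects them. The integer dtype and therefore the byte counts may differ between platforms, so no specific numbers are claimed here.

```python
import numpy as np

claims = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])
loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

# Shape, number of dimensions, and total number of elements
print("shape:", loss_ratios.shape)
print("ndim: ", loss_ratios.ndim)
print("size: ", loss_ratios.size)

# Memory usage: total bytes equals bytes per element times number of elements
print("itemsize:", claims.itemsize, "bytes per element")
print("nbytes:  ", claims.nbytes, "bytes in total")

# Data types: integer input gives an integer dtype, decimal input gives float64
print("claims dtype:     ", claims.dtype)
print("loss_ratios dtype:", loss_ratios.dtype)
print("zeros dtype:      ", np.zeros(3).dtype)
print("ones dtype:       ", np.ones(3).dtype)
```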
Exercise 2
In this exercise, we’ll practice indexing, slicing, and boolean operations.
Part A: Functions
Write a Python function to compute averages using a sliding window over an array. That is, your function will take two arguments (or inputs): (i) an array and (ii) a sliding window size. For example, if you provide your function with the array array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) and the sliding window size \(2\) as inputs, then your function should return as output the array array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]).
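One possible way to write such a function is sketched below. It uses np.convolve with a uniform kernel, which is a choice made here for brevity rather than something the exercise prescribes; the function name sliding_average is hypothetical.

```python
import numpy as np

def sliding_average(values, window):
    """Return the mean of each length-`window` slice of `values`."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > values.size:
        raise ValueError("window must be between 1 and the array length")
    # A uniform kernel of weights 1/window turns convolution into averaging;
    # mode="valid" keeps only positions where the window fits entirely.
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

result = sliding_average(np.arange(10), 2)
print(result)
```

An explicit loop over `values[i:i + window].mean()` gives the same result and may be easier to read when first solving the exercise.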
Now, let’s practice accessing and filtering actuarial data using various indexing techniques.
Part B: Indexing and Slicing
Using the loss ratios array from Exercise 1 parts B and C:
Access the loss ratio for Policy 2 in Year 3
Get all loss ratios for Policy 1 across all years
Get loss ratios for all policies in Year 2
Extract a subset containing the first 2 policies and first 2 years
Get every other policy (1st, 3rd, etc.)
- NumPy indexing: array[row_index, column_index] (indices are zero-based)
- Use : to select all elements along an axis: array[:, 1] gets all rows, column 1
- Slicing syntax: start:stop:step
- For subsets: array[0:2, 0:2] gets the first 2 rows and first 2 columns
- Step slicing: array[::2, :] gets every other row (step=2)
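The indexing patterns above might be applied to the loss ratios array as in the following sketch. The array is recreated here so the block runs on its own, and the variable names are illustrative.

```python
import numpy as np

# Rows are policies 1-4, columns are years 1-3
loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

policy2_year3 = loss_ratios[1, 2]     # zero-based: row 1, column 2
policy1_all_years = loss_ratios[0, :]
year2_all_policies = loss_ratios[:, 1]
first_two_by_two = loss_ratios[:2, :2]
every_other_policy = loss_ratios[::2, :]  # policies 1 and 3

print(policy2_year3)
print(policy1_all_years)
print(year2_all_policies)
print(first_two_by_two)
print(every_other_policy)
```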
Part C: Booleans
Find all loss ratios greater than 0.80
Identify which policies have any year with loss ratio > 0.80
Count how many loss ratios are between 0.60 and 0.80
Create a “high risk” mask for policies with average loss ratio > 0.75
- Boolean operations: >, <, >=, <=, ==
- Combine conditions with & (and) and | (or): (condition1) & (condition2)
- Use parentheses around each condition when combining
- np.any(array, axis=1) checks whether any element is True within each row
- np.where(condition) returns indices where condition is True
- np.sum(boolean_array) counts True values (True=1, False=0)
- Average calculation: \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\)
- np.mean(array, axis=1) calculates the average of each row (one value per policy)
- Risk assessment: Policies with high loss ratios indicate higher risk
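A short sketch of how these Boolean tools could be combined on the loss ratios array. The names are illustrative, and "between 0.60 and 0.80" is assumed to include both endpoints (no value in this data set sits exactly on either bound, so the count is the same either way).

```python
import numpy as np

loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

# All individual loss ratios above 0.80 (returned as a flat array)
above_80 = loss_ratios[loss_ratios > 0.80]

# One Boolean per policy: does any year exceed 0.80?
policy_has_high_year = np.any(loss_ratios > 0.80, axis=1)
policy_numbers = np.where(policy_has_high_year)[0] + 1  # convert to 1-based labels

# Count of loss ratios in [0.60, 0.80]; parentheses around each condition are required
n_between = np.sum((loss_ratios >= 0.60) & (loss_ratios <= 0.80))

# High-risk mask: average loss ratio per policy above 0.75
high_risk_mask = np.mean(loss_ratios, axis=1) > 0.75

print(above_80)
print(policy_numbers)
print(n_between)
print(high_risk_mask)
```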
Exercise 3
Now, let’s apply NumPy’s mathematical functions to actuarial calculations.
Part B: Statistical Analysis
Using the claim amounts from Exercise 1:
Calculate mean, median, standard deviation, minimum, and maximum
Find the 25th, 50th, and 75th percentiles
Calculate the coefficient of variation (std/mean)
Identify claims that are more than 1 standard deviation above the mean
- Basic statistics: np.mean(), np.median(), np.std(), np.min(), np.max()
- Percentiles: np.percentile(array, [25, 50, 75]) gives the 25th, 50th, and 75th percentiles
- Coefficient of variation: \(CV = \frac{\sigma}{\mu}\) (standard deviation / mean)
  - Measures relative variability; useful for comparing risk across different claim sizes
- Outlier detection: Values beyond \(\mu \pm k\sigma\) (often \(k = 1\), \(2\), or \(3\))
  - For this exercise: threshold = \(\mu + \sigma\) (mean + 1 standard deviation)
- len(array) gives the number of elements
- Boolean indexing: array[array > threshold] selects elements meeting the condition
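The statistics above could be computed along these lines. This is a sketch under one assumption worth flagging: np.std() uses the population formula (ddof=0) by default, and the exercise does not say whether the sample version (ddof=1) is intended. Variable names are illustrative.

```python
import numpy as np

claims = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])

mean = np.mean(claims)
median = np.median(claims)
std = np.std(claims)  # population standard deviation (ddof=0), an assumption here
minimum, maximum = np.min(claims), np.max(claims)

# Percentiles with NumPy's default linear interpolation
p25, p50, p75 = np.percentile(claims, [25, 50, 75])

# Coefficient of variation: relative spread of the claim amounts
cv = std / mean

# Claims more than one standard deviation above the mean
threshold = mean + std
large_claims = claims[claims > threshold]

print(f"mean={mean:.2f} median={median:.2f} std={std:.2f}")
print(f"min={minimum} max={maximum}")
print(f"percentiles: {p25:.2f}, {p50:.2f}, {p75:.2f}")
print(f"CV={cv:.3f}")
print("large claims:", large_claims)
```

With this data, only the largest claim exceeds the threshold under either choice of ddof.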
Exercise 4
In this exercise, we’ll focus on a practical actuarial application by creating a simple life insurance premium calculation system.
You need to calculate annual premiums for term life insurance policies. Given:
Face amounts: [$100,000, $250,000, $500,000, $1,000,000]
Ages: [25, 35, 45, 55]
Base mortality rates: [0.001, 0.002, 0.005, 0.012] (annual probability of death)
Interest rate: 3% annually
Policy term: 20 years
Loading factor: 20% (to cover expenses and profit)
Now, given the information above, do the following:
Create arrays for all the given data
Calculate the expected present value of death benefits for each policy
Calculate the expected present value of a $1 annuity for 20 years at each age
Calculate the net annual premium (expected PV of benefits / expected PV of annuity)
Calculate the gross annual premium (net premium × (1 + loading factor))
Create a summary table showing age, face amount, net premium, and gross premium
- Start by creating arrays for all input data using np.array()
- Expected Present Value of Death Benefits: \(EPV_{benefits} = \sum_{t=1}^{n} {}_{t-1}p_x \cdot q_{x+t-1} \cdot v^{t-0.5}\)
  - Where: \({}_{t-1}p_x\) = survival probability, \(q_{x+t-1}\) = death probability, \(v^{t-0.5}\) = discount factor
- Expected Present Value of Annuity: \(EPV_{annuity} = \sum_{t=1}^{n} {}_{t-1}p_x \cdot v^{t-1}\)
- Net Premium Formula: \(P_{net} = \frac{\text{Face Amount} \times EPV_{benefits}}{EPV_{annuity}}\)
- Gross Premium: \(P_{gross} = P_{net} \times (1 + \text{loading})\)
- Survival probability: \({}_{t-1}p_x = (1-q)^{t-1}\) (this treats each policy's base mortality rate \(q\) as constant over the term, so \(q_{x+t-1} = q\))
- Death probability in year \(t\): \({}_{t-1}p_x \times q\)
- Present value discounting: \(v^t = (1 + i)^{-t}\) where \(i\) is the interest rate
- Use np.sum() to add up present values across all years
- Broadcasting: face_amounts[:, np.newaxis] creates a column vector for matrix operations
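The formulas above can be sketched in vectorised form as follows. This is an illustration under the simplifying assumption stated in the hints, namely that each policy's base mortality rate stays constant over the 20-year term; all variable names are hypothetical.

```python
import numpy as np

face_amounts = np.array([100_000, 250_000, 500_000, 1_000_000])
ages = np.array([25, 35, 45, 55])
q = np.array([0.001, 0.002, 0.005, 0.012])  # assumed constant over the term
interest, term, loading = 0.03, 20, 0.20

v = 1 / (1 + interest)        # one-year discount factor
t = np.arange(1, term + 1)    # policy years 1..20, shape (20,)
q_col = q[:, np.newaxis]      # shape (4, 1) so it broadcasts against t

# Probability of surviving the first t-1 years, shape (4, 20)
survival = (1 - q_col) ** (t - 1)

# EPV of a death benefit of 1, paid mid-year on average
epv_benefits = np.sum(survival * q_col * v ** (t - 0.5), axis=1)

# EPV of an annuity of 1 per year, paid at the start of each year while alive
epv_annuity = np.sum(survival * v ** (t - 1), axis=1)

net_premium = face_amounts * epv_benefits / epv_annuity
gross_premium = net_premium * (1 + loading)

# Summary table
print(f"{'Age':>4} {'Face amount':>12} {'Net':>10} {'Gross':>10}")
for age, face, net, gross in zip(ages, face_amounts, net_premium, gross_premium):
    print(f"{age:>4} {face:>12,} {net:>10.2f} {gross:>10.2f}")
```

Under the constant-mortality assumption, the two sums share the factor \({}_{t-1}p_x \cdot v^{t-1}\), so the net premium collapses to \(\text{Face Amount} \times q \times v^{0.5}\). That closed form is a useful sanity check on the vectorised result.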
Exercise 5
In this exercise, we’ll focus on more advanced array operations, namely reshaping and broadcasting.
Part A: Portfolio Analysis
You have quarterly premium data for 3 product lines over 2 years (8 quarters total):
Product A: [100, 120, 110, 130, 125, 140, 135, 150] (in thousands)
Product B: [200, 180, 220, 210, 190, 230, 225, 240]
Product C: [50, 60, 55, 65, 58, 70, 68, 75]
Create a 2D array with products as rows and quarters as columns
Reshape the data to show annual totals (2 years × 3 products)
Calculate quarterly growth rates for each product
Find the best and worst performing quarters for each product
- Create the 2D array with np.array([product_a, product_b, product_c])
- For annual totals: use .reshape(3, 2, 4) to get (products, years, quarters)
- Then use np.sum(array, axis=2) to sum over quarters; the result has shape (3, 2) with products as rows and years as columns, so transpose it with .T if you want years as rows
- Growth rate formula: \(\text{Growth Rate} = \frac{\text{New Value} - \text{Old Value}}{\text{Old Value}} \times 100\%\)
- Year-over-year growth: \(YoY = \frac{Year_2 - Year_1}{Year_1} \times 100\%\)
- For quarterly growth: compare each quarter to the previous one
- np.argmax() and np.argmin() find indices of max/min values
- Convert quarter index to readable format: Q{(index%4)+1}Y{(index//4)+1}
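A sketch of the portfolio analysis steps, with two assumptions made explicit: "best" and "worst" quarters are judged by premium level rather than by growth rate, and the variable names are illustrative.

```python
import numpy as np

product_a = [100, 120, 110, 130, 125, 140, 135, 150]
product_b = [200, 180, 220, 210, 190, 230, 225, 240]
product_c = [50, 60, 55, 65, 58, 70, 68, 75]

# Products as rows, quarters as columns: shape (3, 8)
premiums = np.array([product_a, product_b, product_c])

# Split the 8 quarters into (2 years, 4 quarters) and sum within each year
annual_totals = premiums.reshape(3, 2, 4).sum(axis=2)  # shape (3, 2)

# Quarter-over-quarter growth in percent: shape (3, 7)
growth = (premiums[:, 1:] - premiums[:, :-1]) / premiums[:, :-1] * 100

# Index of the highest and lowest premium quarter per product (assumed criterion)
best_idx = np.argmax(premiums, axis=1)
worst_idx = np.argmin(premiums, axis=1)

def quarter_label(index):
    """Turn a 0-based quarter index into a label such as Q3Y1."""
    return f"Q{(index % 4) + 1}Y{(index // 4) + 1}"

best_labels = [quarter_label(i) for i in best_idx]
worst_labels = [quarter_label(i) for i in worst_idx]

print(annual_totals)
print(np.round(growth, 1))
print(best_labels, worst_labels)
```

Applying np.argmax and np.argmin to `growth` instead would rank quarters by growth; in that case the index refers to the later quarter of each pair, so it needs an offset of one before labelling.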
Part B: Broadcasting
Apply broadcasting to calculate insurance premiums across multiple dimensions:
3 age groups: [25, 40, 55]
4 coverage amounts: [$100k, $250k, $500k, $1M]
Base rate: $2 per $1000 of coverage
Age multipliers: [0.8, 1.0, 1.5]
Calculate the premium matrix showing all combinations.
- Premium calculation formula: \(\text{Premium} = \frac{\text{Coverage}}{1000} \times \text{Base Rate} \times \text{Age Multiplier}\)
- Broadcasting principle: Arrays with compatible shapes can be operated on together
  - Compatible shapes: (3,1) and (1,4) → result (3,4)
- Reshape arrays: age_multipliers.reshape(-1, 1) makes a column vector, coverage.reshape(1, -1) makes a row vector
- Base premium calculation: (coverage / 1000) * base_rate
- Broadcasting automatically expands arrays to compatible shapes
- Final operation: base_premiums * age_multipliers_column
- Result will be a 3×4 matrix (3 ages × 4 coverage amounts)
- Use nested loops and f-string formatting for nice table output
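The broadcasting steps described above might look like the following sketch. Variable names are illustrative; the age multipliers are assumed to correspond to the age groups in the order given.

```python
import numpy as np

ages = np.array([25, 40, 55])
coverage = np.array([100_000, 250_000, 500_000, 1_000_000])
age_multipliers = np.array([0.8, 1.0, 1.5])
base_rate = 2.0  # dollars per $1000 of coverage

# Base premium per coverage amount, as a row vector of shape (1, 4)
base_premiums = ((coverage / 1000) * base_rate).reshape(1, -1)

# Age multipliers as a column vector of shape (3, 1)
age_multipliers_column = age_multipliers.reshape(-1, 1)

# (3, 1) * (1, 4) broadcasts to (3, 4): one row per age, one column per coverage
premium_matrix = base_premiums * age_multipliers_column

# Formatted table using nested loops and f-strings
header = "Age " + "".join(f"{c:>12,}" for c in coverage)
print(header)
for i, age in enumerate(ages):
    row = f"{age:<4}"
    for j in range(coverage.size):
        row += f"{premium_matrix[i, j]:>12,.2f}"
    print(row)
```

Neither input is copied or tiled explicitly; broadcasting stretches the size-1 axes virtually, which is why the reshape step matters more than the multiplication itself.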
Summary
These exercises covered the essential NumPy skills for actuarial science:
- Array Creation: Using appropriate data structures for actuarial data
- Indexing & Slicing: Accessing specific policies, years, or conditions
- Boolean Operations: Filtering high-risk policies and claims
- Mathematical Operations: Premium calculations and present value computations
- Statistical Analysis: Risk assessment and portfolio analysis
- Advanced Operations: Reshaping data and using broadcasting for multi-dimensional calculations