Exercise Set II - NumPy


Exercise 1

We start off by practicing array creation and some basic NumPy operations.

Part A: Basics

Create the following one-dimensional NumPy array of integers: array_1 = np.array([23, 12, 23, 6, 32, 78, 3, 8, 4, 223, 12, 56, 78, 324]). Write some Python code to answer the following questions:

Get the $4$ largest values of array_1.
Compute array_1 to the power of $2$ using $3$ different methods.
Given array_1, negate all elements which are between $23$ and $84$, in place.
Find indices of elements equal to $12$ from array_1.
Convert the one-dimensional array array_1 to a two-dimensional array with $7$ rows.

Hint

Use np.sort() with array slicing for finding largest values.
For powers, try element-wise operations (**), NumPy functions (np.power()), and custom loops.
Boolean indexing with & combines conditions for filtering. Use np.where() or np.nonzero() to find indices.
The .reshape() method changes array dimensions.

Solution

(a)

(b)

(c)

(d)

(e)
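
One possible set of answers, as a minimal sketch. It assumes the bounds in (c) are inclusive (the exercise does not say), and it applies the in-place negation to a copy, named negated here for illustration, so that array_1 keeps its original values for the other parts.

```python
import numpy as np

array_1 = np.array([23, 12, 23, 6, 32, 78, 3, 8, 4, 223, 12, 56, 78, 324])

# (a) Four largest values: sort ascending, then take the last four.
largest_four = np.sort(array_1)[-4:]

# (b) Squares computed three ways: operator, NumPy function, Python loop.
squares_operator = array_1 ** 2
squares_function = np.power(array_1, 2)
squares_loop = np.array([x * x for x in array_1])

# (c) Negate elements between 23 and 84 in place (inclusive bounds assumed).
negated = array_1.copy()  # copy so array_1 stays unchanged for (d) and (e)
mask = (negated >= 23) & (negated <= 84)
negated[mask] *= -1

# (d) Indices of the elements equal to 12.
indices_of_12 = np.where(array_1 == 12)[0]

# (e) Two-dimensional array with 7 rows; -1 lets NumPy infer the column count.
array_2d = array_1.reshape(7, -1)

print(largest_four)
print(negated)
print(indices_of_12)
print(array_2d.shape)
```

To modify array_1 itself, apply the same mask directly to it instead of to the copy.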

Part B: Claims Example

Now, let’s create NumPy arrays to represent actuarial data and perform basic operations.

Create a NumPy array representing annual claim amounts for a portfolio: [12500, 8900, 15600, 22000, 5400, 18900, 31200]

Create a 2D array representing loss ratios for 4 policies over 3 years:

Policy 1: [0.68, 0.72, 0.65]
Policy 2: [0.89, 0.91, 0.85] 
Policy 3: [0.45, 0.52, 0.48]
Policy 4: [0.78, 0.83, 0.76]

Use NumPy functions to create:
- An array of 1000 zeros (for initializing claim counts)
- An array of 500 ones multiplied by 1200 (base premiums)
- An array of integers from 18 to 65 (insurable ages)
- 10 evenly spaced discount factors between 1.0 and 0.5

Hint

Use np.array() to convert Python lists to NumPy arrays
For the 2D array, create a list of lists where each inner list represents one policy
For creating special arrays, look at: np.zeros(), np.ones(), np.arange(), np.linspace()
Remember that np.arange(start, stop) doesn’t include the stop value
You can multiply arrays by scalars: array * 1200

Solution
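
A minimal sketch of one way to build these arrays. The variable names are illustrative, and the age range is assumed to include both 18 and 65.

```python
import numpy as np

# 1D array of annual claim amounts.
claim_amounts = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])

# 2D array: one row per policy, one column per year.
loss_ratios = np.array([
    [0.68, 0.72, 0.65],  # Policy 1
    [0.89, 0.91, 0.85],  # Policy 2
    [0.45, 0.52, 0.48],  # Policy 3
    [0.78, 0.83, 0.76],  # Policy 4
])

claim_counts = np.zeros(1000)                 # 1000 zeros
base_premiums = np.ones(500) * 1200           # 500 entries, each 1200.0
ages = np.arange(18, 66)                      # stop is exclusive, so 66 keeps 65
discount_factors = np.linspace(1.0, 0.5, 10)  # 10 values, both endpoints included

print(loss_ratios.shape)
print(ages[0], ages[-1])
print(discount_factors)
```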

Part C: Array Attributes and Info

Using the arrays created above:

Print the shape, number of dimensions, and total number of elements for the loss ratios array
Calculate the memory usage of the claim amounts array
Check and explain the data types of all arrays

Hint

Array attributes you need: .shape, .ndim, .size, .dtype
For memory usage: .itemsize (bytes per element), .nbytes (total bytes)
Data types depend on the input: integers typically become int64 (int32 on Windows with NumPy versions before 2.0), decimals become float64
Arrays created with np.zeros() default to float64, np.ones() also defaults to float64

Solution
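
A short sketch of how these attributes could be inspected. The arrays are re-created here so the snippet runs on its own; the printed integer dtype depends on the platform and NumPy version.

```python
import numpy as np

claim_amounts = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])
loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

# Shape, number of dimensions, and total number of elements.
print(loss_ratios.shape, loss_ratios.ndim, loss_ratios.size)

# Memory usage: bytes per element times number of elements.
memory_bytes = claim_amounts.nbytes
print(claim_amounts.itemsize, memory_bytes)

# Data types: integer input gives an integer dtype, decimals give float64,
# and np.zeros / np.ones default to float64.
arrays = {
    "claim_amounts": claim_amounts,
    "loss_ratios": loss_ratios,
    "zeros": np.zeros(1000),
    "ones": np.ones(500),
}
for name, arr in arrays.items():
    print(name, arr.dtype)
```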

Exercise 2

In this exercise, we’ll practice indexing, slicing, and boolean operations.

Part A: Functions

Write a Python function to compute averages using a sliding window over an array. The function takes two arguments: (i) an array and (ii) a sliding window size. For example, given the array array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) and a sliding window size of $2$, the function should return array([0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]).

Solution
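
One way to write such a function, as a sketch: it convolves the input with a uniform kernel, which is only one of several reasonable approaches (a cumulative sum or an explicit loop would also work). The function name sliding_window_average is illustrative.

```python
import numpy as np

def sliding_window_average(values, window_size):
    """Return the mean of each full window of length window_size."""
    values = np.asarray(values, dtype=float)
    if not 1 <= window_size <= values.size:
        raise ValueError("window_size must be between 1 and the array length")
    # A uniform kernel whose weights sum to 1 turns convolution into averaging.
    kernel = np.ones(window_size) / window_size
    # mode="valid" keeps only positions where the window fits completely.
    return np.convolve(values, kernel, mode="valid")

result = sliding_window_average(np.arange(10), 2)
print(result)
```

The output has len(values) - window_size + 1 entries, because only complete windows are averaged.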

Now, let’s practice accessing and filtering actuarial data using various indexing techniques.

Part B: Indexing and Slicing

Using the loss ratios array from Exercise 1 parts B and C:

Access the loss ratio for Policy 2 in Year 3
Get all loss ratios for Policy 1 across all years
Get loss ratios for all policies in Year 2
Extract a subset containing the first 2 policies and first 2 years
Get every other policy (1st, 3rd, etc.)

Hint

NumPy indexing: array[row_index, column_index]
Use : to select all elements along an axis: array[:, 1] gets all rows, column 1
Slicing syntax: start:stop:step
For subsets: array[0:2, 0:2] gets first 2 rows and first 2 columns
Step slicing: array[::2, :] gets every other row (step=2)

Solution
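
A minimal sketch of the indexing operations. The loss ratios array is re-created so the snippet is self-contained, and it relies on zero-based indexing: Policy 2 is row 1 and Year 3 is column 2.

```python
import numpy as np

loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

policy2_year3 = loss_ratios[1, 2]         # single element
policy1_all_years = loss_ratios[0, :]     # first row
year2_all_policies = loss_ratios[:, 1]    # second column
first_two_by_two = loss_ratios[0:2, 0:2]  # first 2 policies, first 2 years
every_other_policy = loss_ratios[::2, :]  # rows 0 and 2 (Policies 1 and 3)

print(policy2_year3)
print(policy1_all_years)
print(year2_all_policies)
print(first_two_by_two)
print(every_other_policy)
```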

Part C: Booleans

Find all loss ratios greater than 0.80
Identify which policies have any year with loss ratio > 0.80
Count how many loss ratios are between 0.60 and 0.80
Create a “high risk” mask for policies with average loss ratio > 0.75

Hint

Boolean operations: >, <, >=, <=, ==
Combine conditions with & (and) and | (or): (condition1) & (condition2)
Use parentheses around each condition when combining
np.any(array, axis=1) checks if any element is True along rows
np.where(condition) returns indices where condition is True
np.sum(boolean_array) counts True values (True=1, False=0)
Average calculation: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
np.mean(array, axis=1) calculates average along rows (for each policy)
Risk assessment: Policies with high loss ratios indicate higher risk

Solution
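
A sketch of the boolean operations, with the loss ratios array re-created so it runs independently. The range 0.60 to 0.80 is treated as inclusive here, which is an assumption; no value in this data sits exactly on either bound, so the count is the same either way.

```python
import numpy as np

loss_ratios = np.array([
    [0.68, 0.72, 0.65],
    [0.89, 0.91, 0.85],
    [0.45, 0.52, 0.48],
    [0.78, 0.83, 0.76],
])

# All loss ratios above 0.80, returned as a flat array in row-major order.
high_values = loss_ratios[loss_ratios > 0.80]

# Policies with at least one year above 0.80 (adding 1 gives policy numbers).
has_high_year = np.any(loss_ratios > 0.80, axis=1)
policies_with_high_year = np.where(has_high_year)[0] + 1

# Count of loss ratios between 0.60 and 0.80.
count_mid = np.sum((loss_ratios >= 0.60) & (loss_ratios <= 0.80))

# High-risk mask: average loss ratio per policy above 0.75.
high_risk_mask = np.mean(loss_ratios, axis=1) > 0.75

print(high_values)
print(policies_with_high_year)
print(count_mid)
print(high_risk_mask)
```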

Exercise 3

Now, let’s apply NumPy’s mathematical functions to actuarial calculations.

Part A: Premium Calculations

Calculate risk-adjusted premiums for 5 policies with a base premium of $1000 and risk multipliers: [0.8, 1.2, 1.5, 0.9, 1.8]
Compute present values for the following cashflows using a 4% annual discount rate:
- $5000 in year 1
- $7500 in year 2
- $10000 in year 3
Calculate monthly compound interest for an initial amount of $100,000 at a 6% annual rate over 5 years

Hint

Vectorized multiplication: array1 * array2 multiplies element by element
Array arithmetic: array + value, array - value
Present value formula: $PV = \frac{FV}{(1 + r)^n}$ or $PV = FV \times (1 + r)^{-n}$
- Where: $FV$ = future value, $r$ = discount rate, $n$ = number of periods
Use np.array() to create arrays of cashflows and years
Compound interest formula: $A = P(1 + r)^n$
- Where: $A$ = final amount, $P$ = principal, $r$ = interest rate, $n$ = periods
For monthly compounding: monthly rate = annual rate / 12, periods = years × 12
np.arange(1, months+1) creates array [1, 2, 3, …, months]

Solution
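
A minimal sketch of the three calculations. Variable names are illustrative, and "monthly compound interest" is read here as the month-by-month balance under monthly compounding, with the total interest taken from the final balance.

```python
import numpy as np

# Risk-adjusted premiums: scalar base premium times each multiplier.
base_premium = 1000
risk_multipliers = np.array([0.8, 1.2, 1.5, 0.9, 1.8])
adjusted_premiums = base_premium * risk_multipliers

# Present values: PV = FV / (1 + r)^n, evaluated element-wise.
cashflows = np.array([5000, 7500, 10000])
years = np.array([1, 2, 3])
discount_rate = 0.04
present_values = cashflows / (1 + discount_rate) ** years
total_pv = present_values.sum()

# Monthly compounding: A = P * (1 + r/12)^n for n = 1, ..., 60 months.
principal = 100_000
annual_rate = 0.06
n_years = 5
months = np.arange(1, n_years * 12 + 1)
balances = principal * (1 + annual_rate / 12) ** months
interest_earned = balances[-1] - principal

print(adjusted_premiums)
print(present_values.round(2), round(total_pv, 2))
print(round(balances[-1], 2), round(interest_earned, 2))
```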

Part B: Statistical Analysis

Using the claim amounts from Exercise 1:

Calculate mean, median, standard deviation, minimum, and maximum
Find the 25th, 50th, and 75th percentiles
Calculate the coefficient of variation (std/mean)
Identify claims that are more than 1 standard deviation above the mean

Hint

Basic statistics: np.mean(), np.median(), np.std(), np.min(), np.max()
Percentiles: np.percentile(array, [25, 50, 75]) gives 25th, 50th, 75th percentiles
Coefficient of variation: $CV = \frac{\sigma}{\mu}$ (standard deviation / mean)
- Measures relative variability; useful for comparing risk across different claim sizes
Outlier detection: Values beyond $\mu \pm k\sigma$ (often $k = 1$, $2$, or $3$)
- For this exercise: threshold = $\mu + \sigma$ (mean + 1 standard deviation)
len(array) gives the number of elements
Boolean indexing: array[array > threshold] selects elements meeting condition

Solution
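
A sketch of the summary statistics, with the claim amounts re-created so it runs on its own. Note that np.std uses the population formula (ddof=0) by default; pass ddof=1 if a sample standard deviation is wanted.

```python
import numpy as np

claim_amounts = np.array([12500, 8900, 15600, 22000, 5400, 18900, 31200])

mean_claim = np.mean(claim_amounts)
median_claim = np.median(claim_amounts)
std_claim = np.std(claim_amounts)  # population standard deviation (ddof=0)
min_claim = np.min(claim_amounts)
max_claim = np.max(claim_amounts)

# 25th, 50th, and 75th percentiles (linear interpolation by default).
percentiles = np.percentile(claim_amounts, [25, 50, 75])

# Coefficient of variation: relative variability of the claims.
cv = std_claim / mean_claim

# Claims more than one standard deviation above the mean.
threshold = mean_claim + std_claim
large_claims = claim_amounts[claim_amounts > threshold]

print(round(mean_claim, 2), median_claim, round(std_claim, 2), min_claim, max_claim)
print(percentiles)
print(round(cv, 3))
print(large_claims)
```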

Exercise 4

In this exercise, we’ll focus on a practical actuarial application by creating a simple life insurance premium calculation system.

You need to calculate annual premiums for term life insurance policies. Given:

Face amounts: [$100,000, $250,000, $500,000, $1,000,000]
Ages: [25, 35, 45, 55]
Base mortality rates: [0.001, 0.002, 0.005, 0.012] (annual probability of death)
Interest rate: 3% annually
Policy term: 20 years
Loading factor: 20% (to cover expenses and profit)

Now, given the information above, do the following:

Create arrays for all the given data
Calculate the expected present value of death benefits for each policy
Calculate the expected present value of a $1 annuity for 20 years at each age
Calculate the net annual premium (expected PV of benefits / expected PV of annuity)
Calculate the gross annual premium (net premium × (1 + loading factor))
Create a summary table showing age, face amount, net premium, and gross premium

Hint

Start by creating arrays for all input data using np.array()
Expected Present Value of Death Benefits: $EPV_{benefits} = \sum_{t=1}^{n} {}_{t-1}p_x \cdot q_{x+t-1} \cdot v^{t-0.5}$
- Where: ${}_{t-1}p_x$ = survival probability, $q_{x+t-1}$ = death probability, $v^{t-0.5}$ = discount factor
Expected Present Value of Annuity: $EPV_{annuity} = \sum_{t=1}^{n} {}_{t-1}p_x \cdot v^{t-1}$
Net Premium Formula: $P_{net} = \frac{\text{Face Amount} \times EPV_{benefits}}{EPV_{annuity}}$
Gross Premium: $P_{gross} = P_{net} \times (1 + \text{loading})$
Survival probability: ${}_{t-1}p_x = (1-q)^{t-1}$
Death probability in year t: ${}_{t-1}p_x \times q$
Present value discounting: $v^t = (1 + i)^{-t}$ where $i$ is interest rate
Use np.sum() to add up present values across all years
Broadcasting: face_amounts[:, np.newaxis] creates column vector for matrix operations

Solution
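
A minimal sketch that follows the formulas in the hint. It assumes, as the hint does, that each policy's mortality rate q stays constant over the 20-year term, so the survival probability to the start of year t is (1 - q)^(t-1). Variable names are illustrative.

```python
import numpy as np

face_amounts = np.array([100_000, 250_000, 500_000, 1_000_000])
ages = np.array([25, 35, 45, 55])
mortality_rates = np.array([0.001, 0.002, 0.005, 0.012])
interest_rate = 0.03
term = 20
loading = 0.20

t = np.arange(1, term + 1)          # policy years 1..20, shape (20,)
q = mortality_rates[:, np.newaxis]  # shape (4, 1), broadcasts against t
v = 1 / (1 + interest_rate)         # one-year discount factor

# Probability of surviving to the start of year t (constant q assumed).
survival = (1 - q) ** (t - 1)       # shape (4, 20)

# EPV of a death benefit of 1, discounted to mid-year (v^(t - 0.5)).
epv_benefit_per_unit = np.sum(survival * q * v ** (t - 0.5), axis=1)

# EPV of an annuity of 1 per year paid at the start of each year.
epv_annuity = np.sum(survival * v ** (t - 1), axis=1)

net_premiums = face_amounts * epv_benefit_per_unit / epv_annuity
gross_premiums = net_premiums * (1 + loading)

# Summary table.
print(f"{'Age':>4} {'Face amount':>12} {'Net premium':>12} {'Gross premium':>14}")
for age, face, net, gross in zip(ages, face_amounts, net_premiums, gross_premiums):
    print(f"{int(age):>4} {int(face):>12,} {net:>12,.2f} {gross:>14,.2f}")
```

Under the constant-q assumption, each term of the benefit sum equals the matching annuity term times q and the square root of v, so the net premium reduces to face amount × q × v^0.5. That gives a quick sanity check on the output.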

Exercise 5

In this exercise, we’ll focus on advanced array operations: reshaping, broadcasting, and related manipulations.

Part A: Portfolio Analysis

You have quarterly premium data for 3 product lines over 2 years (8 quarters total):

Product A: [100, 120, 110, 130, 125, 140, 135, 150] (in thousands)
Product B: [200, 180, 220, 210, 190, 230, 225, 240]  
Product C: [50, 60, 55, 65, 58, 70, 68, 75]

Create a 2D array with products as rows and quarters as columns
Reshape the data to show annual totals (2 years × 3 products)
Calculate quarterly growth rates for each product
Find the best and worst performing quarters for each product

Hint

Create 2D array with np.array([product_a, product_b, product_c])
For annual totals: use .reshape(3, 2, 4) to get (products, years, quarters)
Then use .sum(axis=2) to sum over quarters; the result has shape (3, 2), so transpose it with .T for a 2 years × 3 products layout
Growth rate formula: $\text{Growth Rate} = \frac{\text{New Value} - \text{Old Value}}{\text{Old Value}} \times 100\%$
Year-over-year growth: $\text{YoY} = \frac{\text{Year}_2 - \text{Year}_1}{\text{Year}_1} \times 100\%$
For quarterly growth: compare each quarter to the previous one
np.argmax() and np.argmin() find indices of max/min values
Convert quarter index to readable format: Q{(index%4)+1}Y{(index//4)+1}

Solution
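
A sketch of one way to approach the portfolio analysis. "Best" and "worst" quarters are interpreted here as the quarters with the highest and lowest premium volume, which is an assumption; ranking by growth rate would be an equally valid reading. The helper quarter_label is illustrative.

```python
import numpy as np

product_a = [100, 120, 110, 130, 125, 140, 135, 150]
product_b = [200, 180, 220, 210, 190, 230, 225, 240]
product_c = [50, 60, 55, 65, 58, 70, 68, 75]

# Products as rows, quarters as columns: shape (3, 8).
premiums = np.array([product_a, product_b, product_c])

# Split the 8 quarters into (years, quarters), sum over quarters,
# then transpose so rows are years and columns are products: shape (2, 3).
annual_totals = premiums.reshape(3, 2, 4).sum(axis=2).T

# Quarter-over-quarter growth in percent: shape (3, 7).
growth_rates = (premiums[:, 1:] - premiums[:, :-1]) / premiums[:, :-1] * 100

# Index of the highest and lowest quarter for each product.
best_quarters = np.argmax(premiums, axis=1)
worst_quarters = np.argmin(premiums, axis=1)

def quarter_label(index):
    """Turn a 0-based quarter index into a label such as Q4Y2."""
    index = int(index)
    return f"Q{index % 4 + 1}Y{index // 4 + 1}"

print(annual_totals)
print(growth_rates.round(1))
for name, best, worst in zip("ABC", best_quarters, worst_quarters):
    print(f"Product {name}: best {quarter_label(best)}, worst {quarter_label(worst)}")
```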

Part B: Broadcasting

Apply broadcasting to calculate insurance premiums across multiple dimensions:

3 age groups: [25, 40, 55]
4 coverage amounts: [$100k, $250k, $500k, $1M]
Base rate: $2 per $1000 of coverage
Age multipliers: [0.8, 1.0, 1.5]

Calculate the premium matrix showing all combinations.

Hint

Premium calculation formula: $\text{Premium} = \frac{\text{Coverage}}{1000} \times \text{Base Rate} \times \text{Age Multiplier}$
Broadcasting principle: Arrays with compatible shapes can be operated on together
- Compatible shapes: (3,1) and (1,4) → result (3,4)
Reshape arrays: age_multipliers.reshape(-1, 1) makes column vector, coverage.reshape(1, -1) makes row vector
Base premium calculation: (coverage / 1000) * base_rate
Broadcasting automatically expands arrays to compatible shapes
Final operation: base_premiums * age_multipliers_column
Result will be 3×4 matrix (3 ages × 4 coverage amounts)
Use nested loops and f-string formatting for nice table output

Solution
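
A minimal sketch of the broadcasting calculation: a (3, 1) column of age multipliers times a (1, 4) row of base premiums yields the full 3×4 premium matrix. Variable names are illustrative.

```python
import numpy as np

age_groups = np.array([25, 40, 55])
coverage = np.array([100_000, 250_000, 500_000, 1_000_000])
base_rate = 2.0  # premium per 1000 of coverage
age_multipliers = np.array([0.8, 1.0, 1.5])

# Base premium for each coverage amount: shape (4,).
base_premiums = coverage / 1000 * base_rate

# (3, 1) * (1, 4) broadcasts to (3, 4): one row per age, one column per coverage.
premium_matrix = age_multipliers.reshape(-1, 1) * base_premiums.reshape(1, -1)

# Simple table: coverage amounts as the header, one line per age group.
print("Age".ljust(6) + "".join(f"{c:>12,}" for c in coverage.tolist()))
for age, row in zip(age_groups.tolist(), premium_matrix.tolist()):
    print(f"{age:<6}" + "".join(f"{p:>12,.2f}" for p in row))
```

The explicit reshape calls make the shapes visible; age_multipliers[:, np.newaxis] * base_premiums would give the same result.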

Summary

These exercises covered the essential NumPy skills for actuarial science:

Array Creation: Using appropriate data structures for actuarial data
Indexing & Slicing: Accessing specific policies, years, or conditions
Boolean Operations: Filtering high-risk policies and claims
Mathematical Operations: Premium calculations and present value computations
Statistical Analysis: Risk assessment and portfolio analysis
Advanced Operations: Reshaping data and using broadcasting for multi-dimensional calculations