3.12-3.13 Developing Procedures

*Before starting this lesson, open a terminal on your desktop and pip install Requests, Pillow, Pandas, NumPy, Scikit-Learn, TensorFlow, and Matplotlib (for example: pip install requests pillow pandas numpy scikit-learn tensorflow matplotlib).

Learning Objectives:

  • Select appropriate libraries or existing code segments to use in creating new programs

What are Libraries?

In Python, a library is a collection of pre-written code, functions, and modules that extend the language’s capabilities. These libraries are designed to be reused by developers to perform common tasks, rather than having to write the code from scratch. Libraries are essential for simplifying and accelerating the development process, as they provide a wide range of tools and functions for various purposes.

Here are some key points about Python libraries:

  1. Modules: Libraries in Python consist of modules, which are individual Python files containing functions, classes, and variables related to a specific set of tasks or a particular domain. You can import these modules into your own Python code to access their functionality.

  2. Standard Library: Python comes with a comprehensive standard library that includes modules for various tasks, such as working with files, networking, data processing, and more. These modules are readily available and do not require installation.

  3. Third-party Libraries: In addition to the standard library, there is a vast ecosystem of third-party libraries created by the Python community. These libraries cover a wide range of domains, including web development, data analysis, machine learning, game development, and more. Some popular third-party libraries include NumPy, Pandas, Matplotlib, TensorFlow, Django, Flask, and many others.

How Do We Get Libraries into Our Code and Working?

To get a library into our code, we use the import statement followed by the name of the library we want to import.
Let's start simply:

#In this code cell, we are importing the math library which allows us to do math operations,
#and the random library which lets us take pseudorandom numbers and choices.
import math
import random
#We use a library by writing its name, then a dot, then the name of one of its functions.
#For example:
num = 64
print(math.sqrt(num))
numList = [1,2,3,4,5,6]
print(random.choice(numList))
#Here, 'math' and 'random' are the names of the libraries, and 'sqrt' and 'choice' are the names of functions inside them.
We can also import parts of libraries by adding a "from" in front of our import.


from math import sqrt
from random import *
num = 64
print(sqrt(num))
numList = [1,2,3,4,5,6]
print(choice(numList))
8.0
4
Now we don't have to write math. in front of sqrt and can use the function by itself. We can also import *, which imports every public name from the library. Here, we don't have to write random. in front of choice, even though we didn't import choice specifically. Be careful with import *: it can silently overwrite names you have already defined.

Popcorn Hack #1


Import your own library from a list of provided libraries, and use one of its functions. This can be something very bare bones, such as printing the time, getting a random number from a list, or doing something after sleeping for a certain amount of time.

```python
#math library examples: sqrt(num), pow(base, exponent), factorial(num)
#random library examples: choice(list), randrange(lowest, highest, step) [numbers are chosen in multiples of step, counting up from lowest]
#datetime library examples: datetime.now()
#time library examples: sleep(seconds)
import random
print(random.randint(0,1))
```

    1

Documentation

Documentation in Python libraries refers to the written information and explanations provided to help users understand how to use the library, its classes, functions, and modules. It serves as a comprehensive guide that documents the library's functionality, usage, and often includes code examples. Documentation is typically created by the library developers and is an essential component of a well-maintained library.
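In Python code, the smallest unit of documentation is usually a docstring. The sketch below documents a hypothetical `calc_average` function (the name, parameter, and behavior are assumptions made for illustration) so that a reader knows what it computes, what it expects, and what it returns.

```python
def calc_average(grades):
    """Return the arithmetic mean of a list of numeric grades.

    Parameters:
        grades: a non-empty list of ints or floats, e.g. [90, 85, 100]

    Returns:
        The mean as a float (the sum of the grades divided by their count).

    Raises:
        ValueError: if grades is empty.
    """
    if not grades:
        raise ValueError("grades must contain at least one value")
    return sum(grades) / len(grades)


print(calc_average([90, 80, 100]))

# The docstring travels with the function, so users can read it at any time.
print(calc_average.__doc__)
```

With the docstring in place, nobody has to guess whether "average" means mean, median, or mode.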
Examples of documentation: an introductory section explaining the purpose of the library, a section on how to install the library, basic usage examples, etc.

```python
# Imagine a library gave you only this call, with no documentation:
# calcAverage(grades)
'''
You know the name of the procedure and its parameter, but...
You probably wouldn't be able to use this procedure with confidence because you don't know
exactly what it does (maybe you can guess that it finds the average, but you wouldn't know
whether it uses the mean, median, or mode to find it).
You would also need more information on the parameters.
'''
```

Libraries and APIs

- A file that contains procedures that can be used in a program is called a library.
- An Application Program Interface (API) provides specifications for how the procedures in a library behave and can be used. APIs define the methods and functions that are available for developers to interact with a library. They specify how to make requests, provide inputs, and receive outputs, creating a clear and consistent way to use library features.

Which libraries will be very important to us?

  • Requests - Simplifies working with HTTP servers, including requesting data from them and receiving it
  • Pillow - Simplifies image processing
  • Pandas - Simplifies data analysis & manipulation
  • NumPy - Provides array objects whose operations can run much faster than the same work on regular Python lists (a figure of up to 50 times is often quoted)
  • Scikit-Learn - Implements machine learning models and statistical modelling
  • TensorFlow - Builds and trains machine learning models such as neural networks; its wider ecosystem also covers data automation, model tracking, performance monitoring, and model retraining
  • Matplotlib - Creates static, animated, and interactive visualizations in Python

DON'T FORGET TO DOWNLOAD ALL OF THESE (pip install "library")
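One way to confirm that the downloads worked is to ask Python whether it can find each library. The sketch below uses the standard library function `importlib.util.find_spec`. The `LESSON_LIBRARIES` table and the `find_missing` helper are names invented for this example. Note that the pip name and the import name are not always the same (Pillow is imported as PIL, scikit-learn as sklearn).

```python
from importlib.util import find_spec

# pip name -> name used in the import statement (illustrative table for this lesson)
LESSON_LIBRARIES = {
    "requests": "requests",
    "Pillow": "PIL",
    "pandas": "pandas",
    "numpy": "numpy",
    "scikit-learn": "sklearn",
    "tensorflow": "tensorflow",
    "matplotlib": "matplotlib",
}


def find_missing(libraries):
    """Return the pip names whose import name cannot be found."""
    return [
        pip_name
        for pip_name, import_name in libraries.items()
        if find_spec(import_name) is None
    ]


missing = find_missing(LESSON_LIBRARIES)
if missing:
    print("Still need to run: pip install " + " ".join(missing))
else:
    print("All lesson libraries are installed.")
```

Because `find_spec` only locates a module without importing it, the check stays fast even for large libraries such as TensorFlow.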

Popcorn Hack #2


Using the requests library and the ? module (since we should already be using this in our backend), send a GET request to the API at "https://api.github.com".

```python
import requests
#Send a GET request using the requests library. Remember to put your API link in quotes!
#If the status code printed is 200, then you succeeded.
x = requests.get('https://api.github.com')
print(x.status_code)
```

    200

## Scikit-Learn and Numpy

This code uses NumPy to create arrays, and Scikit-Learn to analyze the data. It fits a linear regression that describes the relationship between the X and y arrays, which represent the independent and dependent variables. In simpler terms, it is creating a line of best fit through the data, just like you would in something like Desmos.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate some example data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)  # Feature (independent variable)
y = np.array([2, 4, 5, 4, 5])  # Target (dependent variable)

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create a Linear Regression model
model = LinearRegression()

# Fit the model to the training data
model.fit(X_train, y_train)

# Make predictions on the test data
y_pred = model.predict(X_test)

# Evaluate the model by calculating the Mean Squared Error
mse = mean_squared_error(y_test, y_pred)

# Print the model coefficient and the MSE.
# The coefficient is the slope of the regression line.
# The MSE measures how well the model is performing: the closer it is to 0, the better.
print("Model Coefficients:", model.coef_)
print("Mean Squared Error:", mse)
intercept = model.intercept_
slope = model.coef_[0]
print(f"Linear Regression Equation: y = {intercept} + {slope} * X")
```
    Model Coefficients: [0.68571429]
    Mean Squared Error: 0.7346938775510206
    Linear Regression Equation: y = 1.7714285714285714 + 0.6857142857142857 * X

## Requests

- The requests module allows you to send HTTP requests using Python.
- To install requests, type pip install requests in your terminal.

## Syntax

- requests.methodname(params)

```python
import requests

# Syntax example only: this assumes a server is listening on your own machine at port 9008.
# If nothing answers, the call raises an exception; the timeout keeps it from waiting forever.
x = requests.get('http://127.0.0.1:9008/', timeout=5)

print(x.text)
```

```python
import requests

# Replace this URL with the website you want to request
url = 'https://www.example.com'

# Send a GET request to the URL
response = requests.get(url)

# Check if the request was successful (status code 200)
if response.status_code == 200:
    # Print the content of the response (the HTML content of the webpage)
    print(response.text)
else:
    # If the request was not successful, print an error message
    print(f"Failed to retrieve the page. Status code: {response.status_code}")
```

<!doctype html> Example Domain

Example Domain

This domain is for use in illustrative examples in documents. You may use this domain in literature without prior coordination or asking for permission.

More information...
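The examples above treat status code 200 as success. As a small sketch of what other codes mean, the standard library's `http.HTTPStatus` can translate a number into its official phrase. The helper name `describe_status` is invented for this illustration and is not part of requests.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Turn a numeric HTTP status code into a short readable description."""
    try:
        status = HTTPStatus(code)
    except ValueError:
        # HTTPStatus only knows the registered codes
        return f"{code}: not a standard HTTP status code"
    outcome = "success" if 200 <= code < 300 else "not a success"
    return f"{code} {status.phrase} ({outcome})"


for code in (200, 404, 500):
    print(describe_status(code))
```

When working with requests directly, calling `response.raise_for_status()` is a common alternative: it raises an exception for 4xx and 5xx responses instead of leaving the check to you.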

## Pillow

- Pillow is an imaging library that provides easy-to-use methods to open, change, and save images in different formats.
- To download Pillow onto your computer, enter the command pip install Pillow in your terminal.

```python
from PIL import Image, ImageDraw, ImageFont

# Create a new blank image
width, height = 400, 200
image = Image.new("RGB", (width, height), "white")

# Create an ImageDraw object
draw = ImageDraw.Draw(image)

# Draw a red line from (50, 50) to (350, 150)
line_color = (255, 0, 0)  # Red color
draw.line((50, 50, 350, 150), fill=line_color, width=3)

# Add text to the image
text = "This was created using Pillow!"
text_color = (0, 0, 0)  # Black color
font = ImageFont.load_default()  # Use the default font
text_position = (50, 160)
draw.text(text_position, text, fill=text_color, font=font)

# Display the image.
# This opens the image in your default image viewer. When you stop the code
# it may report an error, but don't worry about that.
image.show()
```

![png](output_24_0.png)

```python
from PIL import Image

# Open an image
original_image = Image.open('image.png')  # replace image.png with the path to a valid image

# Display information about the image
width, height = original_image.size
image_format = original_image.format
print(f"Original Image Size: {width}x{height}")
print(f"Original Image Format: {image_format}")

# Resize the image to half its original size
new_size = (width // 2, height // 2)
resized_image = original_image.resize(new_size)

# Save the resized image (JPEG cannot store transparency, so convert to RGB first)
resized_image.convert("RGB").save('resized_image.jpg')

# Display information about the resized image
resized_width, resized_height = resized_image.size
print(f"Resized Image Size: {resized_width}x{resized_height}")

# Show both the original and resized images
original_image.show()
resized_image.show()
```

If image.png does not exist in the folder you run this from, Image.open raises FileNotFoundError: [Errno 2] No such file or directory: 'image.png'.

## Pandas

This code uses a pandas DataFrame to organize the data into a table, with the categories as column headers along the top and their values listed down each column. Pandas gives the user a much simpler way to organize data, in different styles depending on what the user wants.

```python
#This code uses pandas, which lets you create data tables that are much more organized
#import pandas so it can be used
import pandas as pd

#the data is created here; each key becomes a column and each list fills that column from top to bottom
data = {'Name': ['Matthew', 'Lindsay', 'Josh', 'Ethan'],
        'Grade': [97, 92, 90, 80]}

#define a variable and use pandas to build a DataFrame from the data
df = pd.DataFrame(data)
#there are other forms besides DataFrame: Series (single column), MultiIndex (multiple levels of index),
#and Categorical (categories); the older Panel (3D) form has been removed from current pandas

#shift the row index so it starts at 1 instead of 0
df.index += 1

print(df)
```

          Name  Grade
    1  Matthew     97
    2  Lindsay     92
    3     Josh     90
    4    Ethan     80

Another example, this time using both NumPy and pandas:

```python
import pandas as pd
import numpy as np

# Sample data
data = {
    'Grade': ['A', 'B', 'A', 'C', 'B', 'C', 'A', 'B', 'A'],
    'Percent': [94, 82, 91, 76, 89, 79, 92, 87, 99]
}

# Create a pandas DataFrame
df = pd.DataFrame(data)

# Calculate the mean of each Grade using NumPy
grades = sorted(df['Grade'].unique())
means = [np.mean(df.loc[df['Grade'] == grade, 'Percent'].to_numpy()) for grade in grades]

# Organize the results into a new data table
result = pd.DataFrame({'Grade': grades, 'Mean Grade': means})
result.index += 1

# Display the result
print(result)
```

      Grade  Mean Grade
    1     A        94.0
    2     B        86.0
    3     C        77.5

## TensorFlow

The provided code demonstrates a basic example of linear regression using TensorFlow and Keras. It begins by importing the necessary libraries, TensorFlow and NumPy. It then generates a synthetic dataset with 1000 samples, where the input features are random and the target values are computed as a linear combination of the input features with added noise. A data pipeline is set up using TensorFlow, which includes shuffling and batching the data for efficient processing. A simple linear regression model is defined using Keras, consisting of one dense layer. The model is compiled with the Adam optimizer and mean squared error as the loss function. It is then trained on the synthetic data for ten epochs. After training, the model is used to make predictions on new data points, and the predictions are printed to the console. This code provides a basic illustration of how to perform a simple machine learning task with TensorFlow, from data generation to model training and prediction.
```python
import tensorflow as tf
import numpy as np

# Create a synthetic dataset
num_samples = 1000
input_data = np.random.rand(num_samples, 2)
target_data = input_data[:, 0] * 2 + input_data[:, 1] * 3 + np.random.randn(num_samples)

# Define a data pipeline using TensorFlow
dataset = tf.data.Dataset.from_tensor_slices((input_data, target_data))
dataset = dataset.shuffle(buffer_size=num_samples)
dataset = dataset.batch(32)
dataset = dataset.prefetch(buffer_size=tf.data.AUTOTUNE)

# Create a simple linear regression model using Keras
model = tf.keras.Sequential([
    tf.keras.layers.Dense(1, input_shape=(2,))
])

model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model on the synthetic data
model.fit(dataset, epochs=10)

# Generate predictions
new_data = np.array([[0.5, 0.7], [0.3, 0.2]])
predictions = model.predict(new_data)
print("Predictions:", predictions)
```

    Epoch 1/10
    32/32 [==============================] - 1s 4ms/step - loss: 7.0194
    Epoch 2/10
    32/32 [==============================] - 0s 3ms/step - loss: 6.7075
    Epoch 3/10
    32/32 [==============================] - 0s 4ms/step - loss: 6.4113
    Epoch 4/10
    32/32 [==============================] - 0s 3ms/step - loss: 6.1275
    Epoch 5/10
    32/32 [==============================] - 0s 3ms/step - loss: 5.8550
    Epoch 6/10
    32/32 [==============================] - 0s 3ms/step - loss: 5.5927
    Epoch 7/10
    32/32 [==============================] - 0s 3ms/step - loss: 5.3429
    Epoch 8/10
    32/32 [==============================] - 0s 3ms/step - loss: 5.1042
    Epoch 9/10
    32/32 [==============================] - 0s 2ms/step - loss: 4.8753
    Epoch 10/10
    32/32 [==============================] - 0s 2ms/step - loss: 4.6575
    1/1 [==============================] - 0s 90ms/step
    Predictions: [[1.059348 ]
     [0.5346993]]

The loss is still falling after ten epochs, so the model has not finished learning; the data was generated with weights 2 and 3, which would give about 3.1 and 1.2 for these two inputs. The data is random, so your numbers will differ from run to run.

## Matplotlib

The provided Python code demonstrates the basic usage of Matplotlib, a popular library for creating data visualizations. In this example, we start by importing Matplotlib's pyplot module, often aliased as plt. We define some sample data as lists for the X and Y values. Then, we create a figure and an axis object using plt.subplots(). Next, we plot the data points on the graph with ax.plot(x, y) and set a label for the line. We also add labels for the X and Y axes and set a title for the plot. To provide context for the plot, we include a legend with the label we set earlier. Finally, plt.show() is called to display the graph. When you run this code, it will generate a simple line plot displaying the data points with appropriate labels, a title, and a legend, making it a clear and informative visualization.
```python
import matplotlib.pyplot as plt

# Sample data
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

# Create a figure and axis
fig, ax = plt.subplots()

# Plot the data
ax.plot(x, y, label='Linear Line')

# Set labels and title
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_title('Simple Line Plot')

# Add a legend
ax.legend()

# Display the plot
plt.show()
```

![png](output_33_0.png)

Homework Hack

1) Write code that builds a data table organizing the average values (mean) from a data set that has at least 5 values per category, using 2 libraries.

2) Create a Python script that downloads images from a website using the requests library, processes them with the Pillow library, and then performs data analysis with the Pandas library.

Example answers:

```python
# homework 1
import pandas as pd
import numpy as np

names = ["Bob", "Bill", "Billy"]
# Keys x, y, z line up with the names above, in order.
# (A full answer needs at least 5 values per category.)
grades = {
    "x": [80, 90, 85],
    "y": [90, 23, 100],
    "z": [80, 100, 70]
}

# Calculate the mean grade for each student
mean_grades = [np.mean(grades[key]) for key in grades]

# Create DataFrame
df = pd.DataFrame({'name': names, 'mean_grades': mean_grades})

print(df)
```

        name  mean_grades
    0    Bob    85.000000
    1   Bill    71.000000
    2  Billy    83.333333

```python
# homework 2
import os
import requests
from PIL import Image
import pandas as pd
from io import BytesIO

def download_images(url_list, download_path='images'):
    os.makedirs(download_path, exist_ok=True)

    for i, url in enumerate(url_list):
        # The timeout keeps a slow server from stalling the script
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            # Process image using Pillow
            image = Image.open(BytesIO(response.content))

            # Save the processed image
            img_path = os.path.join(download_path, f'image_{i + 1}.png')
            image.save(img_path)

def analyze_images(image_folder='images'):
    # Get a list of image files
    image_files = [x for x in os.listdir(image_folder) if x.endswith('.png')]

    # Collect the analysis results, one list per column
    data = {'Image name': [], 'Width': [], 'Height': [], 'Size': []}

    for img_file in image_files:
        img_path = os.path.join(image_folder, img_file)
        # The with block closes the file once its details have been read
        with Image.open(img_path) as img:
            data['Image name'].append(img_file)
            data['Width'].append(img.width)
            data['Height'].append(img.height)
            data['Size'].append(img.size)

    # Turn the collected results into a DataFrame
    df = pd.DataFrame(data)

    print("Data Analysis Results:")
    print(df)

if __name__ == "__main__":
    image_urls = [
        'https://images.pexels.com/photos/60597/dahlia-red-blossom-bloom-60597.jpeg?cs=srgb&dl=pexels-pixabay-60597.jpg&fm=jpg&_gl=1*1yud8c1*_ga*MTk0NjEyNTc2Mi4xNjk4NDYyNzA3*_ga_8JE65Q40S6*MTY5ODQ2MjcwOC4xLjAuMTY5ODQ2MjcwOC4wLjAuMA..',
    ]

    download_images(image_urls)
    analyze_images()
```

    Data Analysis Results:
        Image name  Width  Height          Size
    0  image_1.png   3648    2736  (3648, 2736)
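Homework 1 asks for at least five values per category. As a minimal sketch of that requirement, the version below uses two standard library modules, `collections` and `statistics`, instead of pandas and NumPy so that it runs without any installation. The scores and the helper name `mean_by_category` are made up for this illustration.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical (category, score) records: five scores per category
records = [
    ("A", 94), ("A", 91), ("A", 92), ("A", 99), ("A", 89),
    ("B", 82), ("B", 89), ("B", 87), ("B", 85), ("B", 82),
    ("C", 76), ("C", 79), ("C", 71), ("C", 74), ("C", 75),
]


def mean_by_category(rows):
    """Group scores by category and return {category: mean score}."""
    grouped = defaultdict(list)
    for category, score in rows:
        grouped[category].append(score)
    return {category: fmean(scores) for category, scores in sorted(grouped.items())}


table = mean_by_category(records)

# Print a small two-column table
print(f"{'Category':<10}{'Mean':>6}")
for category, mean in table.items():
    print(f"{category:<10}{mean:>6.1f}")
```

The same grouping idea carries over to pandas, where `df.groupby(...).mean()` does the grouping and averaging in one step.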