Read Chapter 4 Vocab and Notes From Chapter 5
Contents
2.1 Numerical Calculation Tool NumPy
2.1.1 Creation, Attributes, and Operations of Arrays
1. Creation of arrays
2. Attributes of arrays
3. Indexing of array elements
4. Modification of arrays
5. Reshaping of arrays
2.1.2 Array operations, universal functions, and broadcasting
1. Arithmetic operations
2. Comparison operations
3. ufunc functions
4. The broadcasting mechanism of ufunc functions
2.1.3 The numpy.random module for random number generation
2.1.4 Text and binary file access
1. Text file access
2. Binary format file access
2.2 File operations
2.2.1 Basic file operations
1. Opening a file
2. File object properties
3. File object methods
4. Closing a file
2.2.2 Read and write operations for text files
2.2.3 File management methods (os module)
1. File and directory lists
2. Renaming files
3. Directory operations in Python
2.3 Data Processing Tool Pandas
2.3.1 Series and DataFrames in Pandas
1. Series
2. DataFrame
2.3.2 Access to external files
1. Reading text files
2. Access to Excel files
3. Getting a subset of the data
2.4 Matplotlib visualization
2.4.1 Basic usage
2.4.2 Visualization applications of matplotlib.pyplot
1. Scatter plot
2. Multiple graphs displayed in one figure window
3. Multiple graphs displayed separately
4. Curves in three-dimensional space
5. Three-dimensional surface plots
6. Contour plots
7. Vector plots (quiver plots)
2.4.3 Integrated visualization applications
2.5 Introduction to the scipy.stats module
2.5.1 Random variables and distributions
1. Continuous random variables and distributions
2. Discrete random variables and distributions
2.5.2 Visualization of probability density functions and distribution laws
Definition: Part Distribution Definition
2.1 Numerical Calculation Tool NumPy
2.1.1 Creation, Attributes, and Operations of Arrays
1. Creation of arrays
Example 2.1 Creating arrays with the array function
import numpy as np
a = np.array([2, 4, 8, 20, 16, 30])
# Passing a list or tuple to array() constructs a simple one-dimensional array.
b = np.array(((1, 2, 3, 4, 5), (6, 7, 8, 9, 10), (10, 9, 1, 2, 3), (4, 5, 6, 8, 9.0)))
# Passing nested lists or tuples constructs a two-dimensional array.
# The call array(...) has its own pair of parentheses; a list argument adds square brackets,
# a tuple argument adds parentheses, and a nested tuple adds one more layer of parentheses.
print('One-dimensional array:', a)   # the comma separates multiple output objects
print('Two-dimensional array:\n', b)
# Usage of print; the format is:
# print(*objects, sep=' ', end='\n', file=sys.stdout)
# objects - the objects to output; multiple objects are separated by commas.
# sep - the string placed between multiple objects.
# end - the string appended at the end; the default is the newline '\n', which can be changed to other characters.
# file - the file object to write to.
Example 2.2 Generating arrays with functions such as arange, empty, and linspace
import numpy as np
a = np.arange(4, dtype=float)          # floating-point array over [0, 4) with step 1
b = np.arange(0, 10, 2, dtype=int)     # integer array over [0, 10) with step 2
c = np.linspace(-1, 2, 5)              # 5 equally spaced points on [-1, 2]
d = np.empty((2, 3), int)              # 2x3 uninitialized array
# empty() only allocates memory and does not initialize it, so it is the fastest; the values it returns are arbitrary
e = np.random.randint(0, 3, (2, 3))    # 2x3 array of random integers in [0, 3)
print(a)
print(b)
print(c)
print(d)
print(e)
Example 2.3 Generating arrays using the imaginary unit "j"
import numpy as np
a = np.linspace(0, 2, 5)               # 5 equally spaced points on [0, 2]
b = np.mgrid[0:2:5j]                   # equivalent to np.linspace(0, 2, 5)
x, y = np.mgrid[0:2:4j, 10:20:5j]      # x takes 4 points on [0, 2], y takes 5 points on [10, 20]
print(a)
print(b)
print('x = {}\ny = {}'.format(x, y))   # format() replaces each {} with the corresponding argument
2. Attributes of arrays
Example 2.4 Generate a 3x5 matrix of random integers in [1, 10] and display its attributes
Key points: the attributes of an array are as follows
- ndim returns an int giving the number of dimensions
- shape returns a tuple giving the size of the array along each dimension, e.g. (3, 4) represents a matrix with 3 rows and 4 columns
- size returns an int giving the total number of elements in the array
- dtype returns the data type
- itemsize returns an int giving the size of each element in bytes
import numpy as np
a = np.random.randint(1, 11, (3, 5))
print('Number of dimensions: {}'.format(a.ndim))
print('Shape: {}'.format(a.shape))
print('Total number of elements: {}'.format(a.size))
print('Type: {}'.format(a.dtype))
print('Bytes per element: {}'.format(a.itemsize))
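Example 2.5 Three ways of generating a one-dimensional vector in the mathematical sense
import numpy as np
a = np.array([1, 2, 3])            # shape (3,): a plain one-dimensional array
b = np.array([[1, 2, 3]])          # shape (1, 3): a row vector
c = np.array([[1], [2], [3]])      # shape (3, 1): a column vector
print(a, a.shape)
print(b, b.shape)
print(c, c.shape)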
3. Indexing of array elements
- General index
import numpy as np
a = np.array([2, 4, 8, 20, 16, 30])
b = np.array(((1, 2, 3, 4, 5), (6, 7, 8, 9, 10), (10, 9, 1, 2, 3), (4, 5, 6, 8, 9.0)))
print(a)
print('-------------------------')
print(b)
print('-------------------------')
print(a[[2, 3, 5]])       # one-dimensional index, outputs [ 8 20 30]
print('-------------------------')
print(a[[-1, -2, -3]])    # negative index, outputs [30 16 20]
print('-------------------------')
print(b[1, 2])            # two-dimensional index: element in the second row, third column
print('-------------------------')
print(b[2])               # two-dimensional index: the third row
print('-------------------------')
print(b[2, :])            # two-dimensional index: the third row
print('-------------------------')
print(b[:, 1])            # two-dimensional index: the second column
print('-------------------------')
print(b[[2, 3], 1:4])     # rows 3 and 4, columns 2 to 4 (a list includes every entry; a slice excludes its end)
print('-------------------------')
print(b[1:3, 1:3])        # rows 2 and 3, columns 2 and 3
As the results above show, when indexing a one-dimensional array, the indices of any positions can be collected into a list and used to fetch the corresponding elements. When indexing a two-dimensional array, the position must be written in the form [rows, cols]: the first half of the brackets controls the row index and the second half controls the column index. To take all rows or all columns, write a single ASCII (half-width) colon in place of the corresponding row or column index.
- Boolean index
from numpy import array, nan, isnan
a = array([[1, nan, 2], [4, nan, 3]])
b = a[~isnan(a)]          # ~ means logical negation
print("b=", b)
print('Elements of b greater than 2:', b[b > 2])
- Fancy index
from numpy import array
x = array([[1, 2], [3, 4], [5, 6]])
print('First two rows:\n', x[[0, 1]])
print('x[0][0] and x[1][1]:', x[[0, 1], [0, 1]])
# The following two forms give the same result
print(x[[0, 1]][:, [0, 1]])
print(x[0:2, 0:2])
The index value of a fancy index is an array. When a one-dimensional integer array is used as the index: if the indexed data is a one-dimensional array, the result is the elements at the corresponding positions; if the indexed data is a two-dimensional array, the result is the rows with the corresponding subscripts.
For two-dimensional data, two index arrays can be supplied, one per dimension. When they are two one-dimensional arrays of the same length, they are paired up as row and column coordinates, and the single values selected this way are combined into a new one-dimensional array.
4. Modification of arrays
(This covers modifying array elements and adding or removing rows and columns.)
import numpy as np
x = np.array([[1, 2], [3, 4], [5, 6]])
x[2, 0] = -1                                      # set the first-column element of the third row to -1
y = np.delete(x, 2, axis=0)
print(y)                                          # the third row has been deleted
z = np.delete(y, 0, axis=1)
print(z)                                          # the first column has been deleted
t1 = np.append(x, [[7, 8]], axis=0)
print(t1)                                         # a row [7, 8] has been added to x
t2 = np.append(x, [[9], [10], [11]], axis=1)
print(t2)                                         # a column [9, 10, 11] has been added to x
- Direct modification:
Assign directly through a general index.
- Delete a row (column):
numpy.delete() takes three parameters: the first is the array to operate on; the second is the row (column) number; the third is the axis, where axis=0 deletes a row and axis=1 deletes a column. It returns a new array and leaves the original unchanged.
- Add a row (column):
numpy.append() takes three parameters: the first is the array to operate on; the second is the content to add. Note that the outermost layer is a pair of brackets: when adding a row, one inner pair of brackets encloses all the elements of that row; when adding a column, each inner pair of brackets encloses a single element. The third parameter is the axis, where axis=0 adds a row and axis=1 adds a column.
5. Reshaping of arrays
Example 2.10 Changing shape with reshape and resize
import numpy as np
a = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
b = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
print(a.reshape(4,), '\n', a)       # outputs [0 1 2 3] and [[0 1] [2 3]]
print(b.resize(4,), '\n', b)        # outputs None and [0 1 2 3]
The difference between reshape and resize when reshaping an array:
1. reshape: a.reshape(m, n, s) turns the array a into an array of m blocks, each with n rows and s columns; if only two arguments n, s are given, the result has n rows and s columns. The return value is the reshaped result, and the shape of the original array is not modified.
2. resize: a.resize(m, n, s) turns the array a into an array of m blocks, each with n rows and s columns; if only two arguments n, s are given, the result has n rows and s columns. There is no return value, and the original array itself is changed.
Example 2.11 Reducing the dimension of an array
import numpy as np
a = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
b = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
c = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
print(a.reshape(-1), '\n', a)       # outputs [0 1 2 3] and [[0 1] [2 3]]
print(b.ravel(), '\n', b)           # outputs [0 1 2 3] and [[0 1] [2 3]]
print(c.flatten(), '\n', c)         # outputs [0 1 2 3] and [[0 1] [2 3]]
To reduce a multi-dimensional array to one dimension, you can use the a.flatten() method or the a.ravel() method. Neither changes the shape of the original array; each returns a one-dimensional result. The a.reshape(-1) method can also be used for flattening, with the parameter set to -1.
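One detail worth checking when choosing between the two methods is whether the flattened result shares memory with the original. The sketch below is an illustration rather than part of the book's examples; the variable names are made up, and it relies on NumPy's documented behavior that flatten() always copies while ravel() returns a view when the memory layout allows it.

```python
import numpy as np

a = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]

flat_copy = a.flatten()             # always an independent copy
flat_view = a.ravel()               # a view here, because a is contiguous

flat_copy[0] = 100                  # does not touch a
flat_view[1] = 200                  # writes through to a[0, 1]

print(a)
print(np.shares_memory(a, flat_copy))
print(np.shares_memory(a, flat_view))
```

If the flattened array will be modified and the original must stay intact, flatten() (or an explicit copy()) is the safer choice.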
Example 2.12 Combining arrays
import numpy as np
a = np.arange(4).reshape(2, 2)         # [[0, 1], [2, 3]]
b = np.arange(4, 8).reshape(2, 2)      # [[4, 5], [6, 7]]
c1 = np.vstack([a, b])                 # vertical combination
c2 = np.r_[a, b]                       # vertical combination
d1 = np.hstack([a, b])                 # horizontal combination
d2 = np.c_[a, b]                       # horizontal combination
print(c1)
print('---------------')
print(c2)
print('---------------')
print(d1)
print('---------------')
print(d2)
Arrays can be combined in two ways: vertically and horizontally.
Vertical combination: c1=np.vstack([a,b]) or c2=np.r_[a,b]. Note: vstack needs an extra pair of brackets around its arguments.
Horizontal combination: d1=np.hstack([a,b]) or d2=np.c_[a,b]. Note: hstack also needs the extra pair of brackets.
Example 2.13 Splitting arrays
import numpy as np
a = np.arange(4).reshape(2, 2)      # [[0, 1], [2, 3]]
b = np.hsplit(a, 2)                 # split a evenly into two column arrays
c = np.vsplit(a, 2)                 # split a evenly into two row arrays
print(b)
print('---------------')
print(c)
print('---------------')
Arrays can be split by row or by column.
Splitting by row: c=np.vsplit(a,2)
Splitting by column: b=np.hsplit(a,2)
Both functions take two parameters: the first is the array to be split, and the second is the number of equal parts to split it into.
2.1.2 Array operations, universal functions, and broadcasting
1. Arithmetic operations
The four arithmetic operations can be performed in NumPy with the operators +, -, *, / or with the functions add(), subtract(), multiply(), divide(). Note that each function accepts only two operands; with more operands, use the operators or nest the function calls.
Besides the four arithmetic operations there are three more operators: remainder, integer division, and power. They can be written with the symbols %, //, ** or with the functions fmod(), modf(), and power(). Note that integer division written with functions is slightly more involved: it is written np.modf(a/b)[1], because modf() returns both the fractional part and the integer part of a value.
Note that all seven operations above are applied element by element to corresponding elements.
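As a minimal sketch of the operator and function forms side by side (the variable names are illustrative, and the last two lines add a caveat the text does not mention: for negative operands np.fmod keeps the sign of the dividend while % keeps the sign of the divisor):

```python
import numpy as np

a = np.arange(10, 15)               # [10 11 12 13 14]
b = np.arange(5, 10)                # [ 5  6  7  8  9]

remainder_op = a % b                # operator form of the remainder
remainder_fn = np.fmod(a, b)        # function form; same result for positive operands

quotient_op = a // b                # integer division, integer result
quotient_fn = np.modf(a / b)[1]     # integer part of a/b, returned as floats

power_op = a ** 2
power_fn = np.power(a, 2)

neg_mod = np.array([-7]) % 3            # sign follows the divisor
neg_fmod = np.fmod(np.array([-7]), 3)   # sign follows the dividend

print(remainder_op, quotient_op, power_op)
print(neg_mod, neg_fmod)
```

For non-negative data the two forms agree, so the operators are usually the more readable choice.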
Example 2.14 Simple array operations
import numpy as np
a = np.arange(10, 15)          # generate the array [10, 11, 12, 13, 14]
b = np.arange(5, 10)           # generate the array [5, 6, 7, 8, 9]
c = a + b; d = a * b           # add or multiply corresponding elements
e1 = np.modf(a / b)[0]         # fractional part of each quotient, returned as floating-point numbers
e2 = np.modf(a / b)[1]         # integer part of each quotient, returned as floating-point numbers
print(a)
print('---------------')
print(b)
print('---------------')
print(e1)
print('---------------')
print(e2)
print('---------------')
2. Comparison operations
Note: a comparison operator returns True or False.
Combining comparison operators with Boolean indexing makes it possible to extract specific elements.
Example 2.15 Comparison operators
import numpy as np
a = np.array([[3, 4, 9], [12, 15, 1]])
b = np.array([[2, 6, 3], [7, 8, 12]])
print(a[a > b])                     # all elements of a greater than the corresponding element of b
print(a[a > 10])                    # all elements of a greater than 10
print(np.where(a > 10, -1, a))      # elements of a greater than 10 are replaced by -1
print(np.where(a > 10, -1, 0))      # -1 where a is greater than 10, otherwise 0
Boolean indexing of a multidimensional array returns a one-dimensional array.
The array returned by np.where keeps the shape of the original array.
3. ufunc functions
A ufunc, or universal function, is a function that operates on an array element by element and returns its output as a NumPy array. At present NumPy includes more than 60 universal functions, covering the four arithmetic operations, remainder, absolute value, power functions, exponential functions, trigonometric functions, bit operations, comparison operations, and logical operations.
Example 2.16 Efficiency of ufunc functions
import numpy as np, time, math
x = [i * 0.01 for i in range(1000000)]
start = time.time()
for (i, t) in enumerate(x):
    x[i] = math.sin(t)
print('math.sin:', time.time() - start)
y = np.array([i * 0.01 for i in range(1000000)])
start = time.time()
y = np.sin(y)
print('numpy.sin:', time.time() - start)
It can be seen that, for array operations, the NumPy ufunc takes much less time than the math module.
4. The broadcasting mechanism of ufunc functions
Broadcasting refers to the way arithmetic operations are performed between arrays of different shapes. When a ufunc is applied to two arrays, it operates on their corresponding elements, which presupposes that the two arrays have the same shape. If the shapes differ, NumPy applies the broadcasting mechanism. Broadcasting follows rules, however, and if these rules are not met the operation raises an error. The main broadcasting rules are:
(1) The input arrays may have different numbers of dimensions, but the sizes of corresponding dimensions, compared from right to left, must be equal.
(2) If the sizes of a corresponding dimension are not equal, one of them must be 1.
Example 2.17 The broadcasting mechanism
import numpy as np
a = np.arange(0, 20, 10).reshape(-1, 1)     # shape (2, 1)
b = np.arange(0, 3)                         # shape (3,)
print(a)
print('-------------')
print(b)
print('-------------')
print(a + b)
The example shows what the broadcasting mechanism does: the arrays are first expanded to a common shape, here (2, 3), and then the operation is carried out element by element.
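The expansion step can be made visible with np.broadcast_arrays, and an incompatible pair of shapes shows what happens when the rules are not met. This is a sketch for illustration; the arrays c and d are made-up examples, not data from the book.

```python
import numpy as np

a = np.arange(0, 20, 10).reshape(-1, 1)     # shape (2, 1)
b = np.arange(0, 3)                         # shape (3,)

# Expand both operands to the common shape (2, 3), as broadcasting does implicitly.
a_full, b_full = np.broadcast_arrays(a, b)
explicit_sum = a_full + b_full
implicit_sum = a + b

# Shapes (4,) and (2, 3): the rightmost sizes are 4 and 3, neither is 1, so the rules fail.
c = np.arange(4)
d = np.ones((2, 3))
try:
    c + d
    incompatible = False
except ValueError:
    incompatible = True

print(a_full.shape, b_full.shape)
print(implicit_sum)
print(incompatible)
```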
2.1.3 The numpy.random module for random number generation
2.1.4 Text and binary file access
File access formats fall into two categories, binary and text. Binary files are further divided into NumPy's dedicated binary format and unformatted binary.
1. Text file access
(1) Accessing text files with savetxt() and loadtxt()
The savetxt() function saves a one-dimensional or two-dimensional array to a text file.
The loadtxt() function loads the data in a text file into a one-dimensional or two-dimensional array.
Example 2.18 Text file access
import numpy as np
a = np.arange(0, 3, 0.5).reshape(2, 3)       # generate a 2x3 array
np.savetxt('Pdata2_18_1.txt', a)             # values are saved in '%.18e' format by default, separated by spaces
# The saved result is as follows:
# 0.000000000000000000e+00 5.000000000000000000e-01 1.000000000000000000e+00
# 1.500000000000000000e+00 2.000000000000000000e+00 2.500000000000000000e+00
b = np.loadtxt('Pdata2_18_1.txt')
print('b=', b)
np.savetxt('Pdata2_18_2.txt', a, fmt='%d', delimiter=',')    # save as integers, separated by commas
# The saved result is as follows:
# 0,0,1
# 1,2,2
c = np.loadtxt('Pdata2_18_2.txt', delimiter=',')             # the comma delimiter must also be specified when reading
print('c=', c)
Note that when saving an array with savetxt(), you need to add the fmt parameter if you want the values saved as integers.
Example 2.19 The text file Pdata2_19.txt stores data in the following format. Read the data into array a, then extract the elements in the first two rows and the second to fourth columns of a to construct a 2-row, 3-column array b.
import numpy as np
a = np.loadtxt('Pdata2_19.txt')
b = a[0:2, 1:4]          # slice index
print('b=', b)
Example 2.20 Data in the following format is stored in a text file; extract the numerical data.
import numpy as np
a = np.loadtxt('Pdata2_20.txt', dtype=str, delimiter=',', encoding='utf-8')
# read the 4x4 array as strings, with ',' as the separator
# encoding='utf-8' states the encoding of the source file; without it the following error is reported:
# 'gbk' codec can't decode byte 0xab in position 23: illegal multibyte sequence
b = a[1:, 1:].astype(float)     # extract the numeric rows and columns of the matrix and convert them to float
print('b=', b)
(2) Reading text file data with genfromtxt
If you need to handle more complicated data structures, for example data with missing values, you can use genfromtxt.
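A minimal sketch of the missing-data case, assuming a small comma-separated table held in memory (the text and its values are hypothetical, and io.StringIO stands in for a file on disk). By default genfromtxt marks an empty field as missing and fills it with nan for float data; the filling_values parameter substitutes another value.

```python
import numpy as np
from io import StringIO

text = "1,2,3\n4,,6\n7,8,9"       # hypothetical data; the middle field of row 2 is empty

with_nan = np.genfromtxt(StringIO(text), delimiter=',')
with_zero = np.genfromtxt(StringIO(text), delimiter=',', filling_values=0)

print(with_nan)      # the missing field appears as nan
print(with_zero)     # the missing field is replaced by 0
```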
The parameter list of the genfromtxt function is as follows:
Example 2.21 The plain text file Pdata2_21.txt stores the following data. Read the first eight columns of the first six rows, the numerical data of the ninth column, and the last row of data.
import numpy as np
a = np.genfromtxt('Pdata2_21.txt', max_rows=6, usecols=range(8))
# max_rows: the maximum number of rows to read; usecols: the columns to read
b = np.genfromtxt('Pdata2_21.txt', dtype=str, max_rows=6, usecols=[8])
# dtype specifies the data type of the data read; the default is floating point, so str must be specified if the data contains strings
b = [float(v.rstrip('kg')) for (i, v) in enumerate(b)]     # strip the 'kg' unit and convert to float
c = np.genfromtxt('Pdata2_21.txt', skip_header=6)          # skip_header: the number of lines to skip at the beginning
print(a, '\n', b, '\n', c)
2. Binary format file access
(1) Accessing binary files with tofile() and fromfile()
The tofile() method of an array object conveniently writes the data in the array to a file in binary format, but the output of tofile() does not save information such as the array shape and element type. Therefore, when reading the data back with the fromfile() function, the user must specify the element type and reshape the array accordingly.
Example 2.22 Accessing binary format files with tofile() and fromfile()
import numpy as np
a = np.arange(6).reshape(2, 3)
a.tofile('Pdata2_22.bin')
b = np.fromfile('Pdata2_22.bin', dtype=int).reshape(2, 3)     # dtype and shape must be supplied by hand
print(b)
(2) Accessing NumPy's dedicated binary format with load(), save(), and savez()
load() and save() read and write data in NumPy's dedicated binary format, which automatically handles information such as the element type and shape.
If you want to save multiple arrays to one file, you can use savez(). The first parameter of savez() is the file name, and the subsequent parameters are the arrays to be saved; the output is a single archive file with the extension npz.
Example 2.23 Accessing NumPy's dedicated binary format files
import numpy as np
a = np.arange(6).reshape(2, 3)
np.save('Pdata2_23_1.npy', a)
b = np.load('Pdata2_23_1.npy')
c = np.arange(6, 12).reshape(2, 3)
d = np.sin(c)
np.savez('Pdata2_23_2.npz', c, d)
e = np.load('Pdata2_23_2.npz')
print(e['arr_0'])     # extract the data of the first array
print(e['arr_1'])     # extract the data of the second array
2.2 File operations
2.2.1 Basic file operations
1. Opening a file
Whether a file is a text file or a binary file, the procedure for working with it is basically the same:
- First open the file and create a file object
- Then read, write, delete, or modify the file content through this file object
The open() function opens the specified file in the specified mode and creates a file object.
file object name = open(file name[, open mode[, buffering]])
The file name specifies the file to open, which can be located through a relative or an absolute path. When using an absolute path, either add the r prefix to the string or write "\" as "\\". The open mode specifies how the file is handled after it is opened. The buffering argument specifies the buffering mode for reading and writing the file: 0 means no buffering, 1 means buffered, a value greater than 1 gives the size of the buffer, and the default is buffered mode.
If the call succeeds, open() returns a file object; if the file does not exist or the permissions are insufficient, it raises an exception.
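The two outcomes of open() can be sketched as follows. The file names and the temporary directory are hypothetical, chosen so that the example does not touch any real data; FileNotFoundError is the exception Python raises when a file opened for reading does not exist.

```python
import os
import tempfile

tmp_dir = tempfile.mkdtemp()                     # scratch directory for the demonstration
path = os.path.join(tmp_dir, 'demo.txt')         # hypothetical file name

f = open(path, 'w')                              # success: a file object is returned
f.write('hello')
f.close()

f = open(path, 'r')
content = f.read()
f.close()

try:
    open(os.path.join(tmp_dir, 'missing.txt'), 'r')   # this file does not exist
    error_name = None
except FileNotFoundError as exc:
    error_name = type(exc).__name__

print(content, error_name)
```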
2. File object properties
3. File object methods
4. Closing a file
After you have finished with a file, close it with the close() method.
file object name.close()
Example 2.24 File object properties
f = open('Pdata2_24.txt', 'w')        # create a file object
# output various properties of the file
print('Name of the file:', f.name)
print('Closed or not:', f.closed)
print('Opening mode:', f.mode)
f.close()
2.2.2 Read and write operations for text files
This section uses some examples to show the read and write operations for text files. First use Notepad to create the file Pdata2_25.txt with the following contents:
Python is very useful.
Programming in Python is very easy.
Example 2.25 Count the number of vowels in the text file Pdata2_25.txt
f = open('Pdata2_25.txt', 'r')
s = f.read()          # read the file content into s as a string
f.close()
print(s)              # display the contents of the file
n = 0
for i in s:
    if i in 'aeiouAEIOU':
        n = n + 1
print('Number of vowels:', n)
Example 2.26 Writing data to a text file
f1 = open('Pdata2_26.txt', 'w')
str1 = ['Hello', ' ', 'World']; str2 = ['Hi', 'World!']
f1.writelines(str1); f1.write('\n')
f1.writelines(str2); f1.close()
f2 = open('Pdata2_26.txt')
a = f2.read()
f2.close()
print(a)
The parameter of file.write(str) is a string, namely the content you want to write to the file. The parameter of file.writelines(sequence) is a sequence, such as a list, which it iterates over and writes to the file for you.
Example 2.27 (Example 2.21 revisited) Read the first eight columns of the first six rows of the file Pdata2_21.txt, the numerical data of the ninth column, and the last row of data.
import numpy as np
a = []; b = []; c = []
with open('Pdata2_21.txt') as file:
    for (i, line) in enumerate(file):
        elements = line.strip().split()
        # strip() removes whitespace or specified characters from both ends of a string;
        # split() splits the string on whitespace or another separator
        if i < 6:
            a.append(list(map(float, elements[:8])))
            # map() applies the given function to every item of the sequence; here everything is converted to float
            b.append(float(elements[-1].rstrip('kg')))
        else:
            c = [float(x) for x in elements]
a = np.array(a); b = np.array(b); c = np.array(c)
print(a, '\n', b, '\n', c)
When the data file is opened with a with statement and bound to the file object, you do not have to close the file yourself after working with it: it is closed automatically.
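The automatic close can be checked through the closed property of the file object. The sketch below writes to a hypothetical file in a temporary directory and reads it back; the file name is illustrative only.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')    # hypothetical file name

with open(path, 'w') as f:
    f.writelines(['Hello', ' ', 'World'])    # writelines() adds no separators or newlines
    f.write('\n')
closed_after_write = f.closed                # True: the with block closed the file on exit

with open(path) as f:
    lines = [line.strip() for line in f]     # iterate over the file line by line

print(closed_after_write, lines)
```

The file is also closed if an exception is raised inside the with block, which is the main reason to prefer this form over a manual close().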
2.2.3 File management methods (os module)
1. File and directory lists
Example 2.28 Displaying the contents of a specified directory
# Program file for Example 2.28
import os
a = os.listdir("c:\\")
print(a)     # display the list of files and directories under the root directory of drive C
print("-------------------------------------")
b = os.listdir(".")
print(b)     # display the list of files and directories in the current working directory
The results are omitted for privacy.
2. Renaming files
The rename() method renames a file. Its format is
os.rename('current file name', 'new file name')
For example, to rename the file text1.txt to text2.txt:
import os
os.rename('text1.txt', 'text2.txt')
3. Directory operations in Python
(1) mkdir() method - create a new directory
os.mkdir('new directory name')
(2) chdir() method - change the current directory
os.chdir('directory to switch to')
(3) getcwd() method - display the current working directory
os.getcwd()
(4) rmdir() method - delete an empty directory
os.rmdir('name of the directory to delete')
Before deleting a directory with the rmdir() method, you must first make sure that the directory is empty.
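The methods above can be tried together in a throwaway directory. This is a sketch under stated assumptions: the directory and file names are hypothetical, tempfile.mkdtemp() supplies a scratch location, and os.remove() (not covered in the text) is used to delete the file so that the directory is empty before rmdir() is called.

```python
import os
import tempfile

base = tempfile.mkdtemp()            # scratch directory, so nothing real is touched
start_dir = os.getcwd()              # remember where we started

os.chdir(base)                       # change the current directory
os.mkdir('demo_dir')                 # create a new directory
with open('text1.txt', 'w') as f:
    f.write('demo')
os.rename('text1.txt', 'text2.txt')  # rename the file
listing = sorted(os.listdir('.'))    # contents of the current directory

os.remove('text2.txt')               # delete the file (os.remove is an addition to the text)
os.rmdir('demo_dir')                 # works because demo_dir is empty
remaining = os.listdir('.')

os.chdir(start_dir)                  # go back before removing the scratch directory
os.rmdir(base)

print(listing, remaining)
```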
2.3 Data Processing Tool Pandas
Pandas is one of Python's most powerful data analysis and exploration tools. It can compute statistics including the mean, variance, quantiles, correlation coefficients, and covariance.
mean(): computes the arithmetic mean of the sample data
std(): computes the standard deviation of the sample data
cov(): computes the covariance matrix of the sample data
var(): computes the variance of the sample data
describe(): describes the basic characteristics of the sample data, including the number of non-NaN values, the mean, the standard deviation, the minimum, the maximum, and the 25%, 50%, and 75% quantiles of the sample.
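A short sketch of these methods on a small DataFrame. The column names and values are made up for illustration; note that pandas computes std() and var() with the sample convention (dividing by n - 1) by default.

```python
import pandas as pd

# Hypothetical sample data: x2 is exactly twice x1.
df = pd.DataFrame({'x1': [1.0, 2.0, 3.0, 4.0],
                   'x2': [2.0, 4.0, 6.0, 8.0]})

means = df.mean()          # arithmetic mean of each column
stds = df.std()            # sample standard deviation of each column
variances = df.var()       # sample variance of each column
cov = df.cov()             # covariance matrix of the columns
summary = df.describe()    # count, mean, std, min, quartiles, max

print(means)
print(cov)
print(summary)
```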
2.3.1 Series and DataFrames in Pandas
Pandas data structures range from one-dimensional to three-dimensional. Series is one-dimensional, DataFrame is two-dimensional, and Panel is a data structure with three or more dimensions.
A Series is a labeled one-dimensional array that can store any type of data, such as integers, floating-point numbers, strings, and other valid Python objects. Its row labels are called the index.
A DataFrame is a labeled two-dimensional array with rows and columns. The columns can have different types. A DataFrame can be viewed as a two-dimensional structure such as a spreadsheet or a database table. It can also be regarded as a collection of several Series of different types.
Panel is used much less often than Series and DataFrame, and it has been removed from recent versions of pandas.
1. Series
A Series can be constructed in the following ways:
- From a list or tuple of elements of the same type.
- From a dictionary.
- From a one-dimensional array in NumPy.
- From a column of a DataFrame.
Example 2.29 Constructing a Series
import pandas as pd
import numpy as np
s1 = pd.Series(np.array([10.5, 20.5, 30.5]))                                 # construct from an array
s2 = pd.Series({"Beijing": 10.5, "Shanghai": 20.5, "Guangdong": 30.5})       # construct from a dictionary
s3 = pd.Series([10.5, 20.5, 30.5], index=['b', 'c', 'd'])                    # specify the row labels
print(s1); print("--------------"); print(s2)
print("--------------"); print(s3)
The displayed results show that a Series has two columns:
For a Series constructed from an array, the first column is the row index of the Series (which can be understood as a row number) starting from 0, and the second column holds the actual values. For a Series constructed from a dictionary, the first column holds the specific row names (the index), corresponding to the keys of the dictionary, and the second column holds the actual values, corresponding to the values of the dictionary.
A Series is very similar to a one-dimensional array: all the methods for getting elements of a one-dimensional array can be applied to a Series, and the mathematical and statistical functions for arrays can also be applied to Series objects. In addition, a Series has further processing methods of its own.
Example 2.30 Indexing and computing with a Series
import pandas as pd
import numpy as np
s = pd.Series([10.5, 20.5, 98], index=['a', 'b', 'c'])     # construct the Series with specified row labels
a = s['b']            # take the second element; outputs 20.5
b1 = np.mean(s)       # arithmetic mean via the mean function in NumPy
b2 = s.mean()         # arithmetic mean via the pandas method
print(s); print('-------------')
print(b1, b2)
2. DataFrame
A DataFrame is a two-dimensional data structure composed of rows and columns. The row index and the column names are optional, but it is best to set them. The index can be seen as the row labels, and the column names can be seen as the column labels.
A DataFrame is created as follows:
DataFrame(data=2D data[, index=row index[, columns=column index[, dtype=data type]]])
The data can be a two-dimensional NumPy array. When the data is a dictionary, its values are one-dimensional arrays and its keys are the column names of the DataFrame.
Example 2.31 Constructing a DataFrame
import pandas as pd
import numpy as np
a = np.arange(1, 7).reshape(3, 2)
df1 = pd.DataFrame(a)                                                    # default integer row and column labels
df2 = pd.DataFrame(a, index=['a', 'b', 'c'], columns=['x1', 'x2'])       # explicit row and column labels
df3 = pd.DataFrame({'x1': a[:, 0], 'x2': a[:, 1]})                       # construct from a dictionary
print(df1); print("---------"); print(df2)
print("---------"); print(df3)
2.3.two Admission to external files
1. Reading the text file
The READ_CSV function in the PANDAS module, yous tin can read TXT or CSV (comma-separated text files) text format data, the call format is:
Read_csv (parameter list)
The common parameters are as follows:
Example 2.32 Read the TXT text data shown below
import pandas as pd
# parse_dates: here a dictionary parses the first three columns as a date and merges them into the new field birthday.
# If parse_dates is True, pandas tries to parse the row index; if it is a list, the listed columns are parsed as dates;
# if it is a nested list, the listed columns are merged into one date column; if it is a dictionary, the listed columns
# (the dictionary values) are parsed and the result gets a new field name (the dictionary key).
# skiprows: number of rows to skip at the start of the data set; skipfooter: number of rows to skip at the end.
# comment: marks the rest of a line as a comment to be ignored; it must be a single character. Like blank lines
# (when skip_blank_lines=True), fully commented lines are ignored by header but not by skiprows.
# thousands: thousands separator, such as "," or "."; engine: the parsing engine, C or Python.
# The C engine is faster, the Python engine is more feature-complete.
a = pd.read_csv("Pdata2_32.txt", sep=',', parse_dates={'birthday': [0, 1, 2]},
                skiprows=2, skipfooter=2, comment='#', thousands='&', engine='python')
print(a)
For a detailed explanation of the read_csv parameter list, see the CSDN blog post "Pandas.read_csv parameters in detail" (weixin_30487317).
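The effect of skiprows, skipfooter, comment and thousands can be checked without any file on disk. The sketch below is only an illustration: the in-memory text and its column names are made up here and are not the book's Pdata2_32.txt.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for a text file (not the book's data).
text = (
    "junk line 1\n"
    "junk line 2\n"
    "id,amount\n"
    "1,1&200\n"
    "2,3&400#inline note\n"
    "3,500\n"
    "footer 1\n"
    "footer 2\n"
)

df = pd.read_csv(
    io.StringIO(text),
    sep=",",
    skiprows=2,        # drop the two junk lines at the top
    skipfooter=2,      # drop the two footer lines (requires engine='python')
    comment="#",       # ignore everything after '#' on a line
    thousands="&",     # '1&200' is read as the integer 1200
    engine="python",
)
print(df)
```

Only the three data rows should survive, with the amount column parsed as numbers rather than strings.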
2. Access to the Excel file
The read_excel() function reads the data in an Excel file. Its commonly used call format is:
read_excel(io,sheet_name=0,header=0,names=None,index_col=None,parse_cols=None,usecols=None,dtype=None)
where
- io: Excel file name.
- sheet_name: sheet name or sheet index.
Example 2.33 The Excel file Pdata2_33.xlsx is shown in the following figure; compute descriptive statistics of the data
import pandas as pd
a = pd.read_excel("Pdata2_33.xlsx", usecols=range(1, 4))  # extract columns 2 to 4
b = a.values  # extract the data as a NumPy array
c = a.describe()  # descriptive statistics of the data
print(a); print('--------')
print(b); print('--------')
print(c); print('--------')
An example of writing data to an Excel file is given below.
Example 2.34 Read the data in the Excel file Pdata2_33.xlsx, then write it to the sheets sheet1 and Sheet2 of another file Pdata2_34.xlsx
import pandas as pd
import numpy as np
a = pd.read_excel("Pdata2_33.xlsx", usecols=range(1, 4))  # extract columns 2 to 4
b = a.values  # extract the data
# build DataFrame-type data
c = pd.DataFrame(b, index=np.arange(1, 11), columns=["User A", "User B", "User C"])
f = pd.ExcelWriter('Pdata2_34.xlsx')  # create the writer object
c.to_excel(f, sheet_name="sheet1")  # write c to the first sheet
c.to_excel(f, sheet_name="Sheet2")  # write c again to another sheet
f.close()  # writes the file; ExcelWriter.save() was removed in pandas 2.0
3. Acquisition of a data subset
The pandas library provides two indexers, iloc and loc, for obtaining data subsets. Their syntax can be expressed as:
[rows_select,cols_select]
iloc can only filter data by row number and column number, which is similar to array indexing. Numbering starts from 0, slices may use a step, and the upper bound of a slice is still excluded.
loc can specify row labels (row names) and column labels (field names); rows_select can also be given as a filter condition.
Example 2.35 (continuing Example 2.33) Read the first six records of users A and B
import pandas as pd
import numpy as np
a = pd.read_excel("Pdata2_33.xlsx", usecols=range(1, 4))  # extract columns 2 to 4
b1 = a.iloc[np.arange(6), [0, 1]]  # filter data by position
b2 = a.loc[np.arange(6), ["User A", "User B"]]  # filter data by label
print(b1); print('------------------'); print(b2)
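The difference between position-based and label-based selection is easiest to see when the row labels do not start at 0. The following is a minimal sketch on made-up data (the values and the 1..10 row labels are assumptions for illustration, not the contents of the Excel file):

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for the Excel sheet of Example 2.33.
df = pd.DataFrame(
    np.arange(1, 31).reshape(10, 3),
    index=np.arange(1, 11),                      # row labels 1..10
    columns=["User A", "User B", "User C"],
)

by_position = df.iloc[0:6, [0, 1]]               # positions 0..5, upper bound excluded
by_label = df.loc[1:6, ["User A", "User B"]]     # labels 1..6, upper bound included
by_condition = df.loc[df["User A"] > 20, ["User A", "User C"]]  # boolean row filter

print(by_position.equals(by_label))
print(by_condition)
```

Note that a loc slice includes its upper label, whereas an iloc slice excludes its upper position, so both selections above return the same six rows.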
2.4 Matplotlib visualization
2.4.1 Basic Usage
Matplotlib uses the concept of object containers. There are four kinds of container: Figure, Axes, Axis, and Tick. They are nested layer by layer, each containing the next.
- Figure is responsible for graphical size, location, etc.
- Axes is responsible for the position of the coordinate axes, drawing, and other operations;
- Axis is responsible for the settings of a coordinate axis;
- Tick is responsible for formatting the style of the scale marks.
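The containment relationship in the list above can be inspected directly. The following is a small sketch, assuming a non-interactive backend so that it runs without a display; the plotted data are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.axis import Axis, Tick

fig, ax = plt.subplots(figsize=(4, 3))   # a Figure holding one Axes
ax.plot([0, 1, 2], [0, 1, 4])
fig.canvas.draw()                        # lay out the ticks

xaxis = ax.xaxis                         # the Axes holds Axis objects
ticks = xaxis.get_major_ticks()          # the Axis holds Tick objects

print(ax in fig.axes)                    # Figure -> Axes
print(isinstance(xaxis, Axis))           # Axes -> Axis
print(isinstance(ticks[0], Tick))        # Axis -> Tick
```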
In the Matplotlib module, line graphs can be drawn with the plot() function. The syntax is as follows:
plot(x,y,s)
x: x coordinates of the data points
y: y coordinates of the data points
s: a string specifying the line color, line style, data point shape, etc.
The common parameters of the plot() function are as follows:
- linestyle: specifies the type of the line, which can be solid, dashed, dash-dot, etc.; the default is a solid line.
- linewidth: specifies the width of the line.
- marker: adds markers at the data points and sets their shape.
- markersize: sets the size of the markers.
- markeredgecolor: sets the edge color of the markers.
- markerfacecolor: sets the fill color of the markers.
- markeredgewidth: sets the edge width of the markers.
- label: the label of the line, used for the legend.
- alpha: sets the transparency of the graphics.
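The keyword parameters listed above can be combined in a single plot() call. The sketch below uses arbitrary values chosen only for illustration and a non-interactive backend; it returns the Line2D object so that the settings can be read back.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 20)
(line,) = plt.plot(
    x, np.sin(x),
    linestyle="--", linewidth=2,           # dashed line, width 2
    marker="o", markersize=6,              # circular markers
    markeredgecolor="k", markerfacecolor="w", markeredgewidth=1.5,
    label="sin(x)", alpha=0.8,             # legend entry and transparency
)
plt.legend()

print(line.get_linestyle(), line.get_linewidth(), line.get_marker())
```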
Other common functions of the matplotlib.pyplot module include:
- pie(): draws a pie chart.
- bar(): draws a bar chart.
- hist(): draws a histogram.
- scatter(): draws a scatter plot.
Example 2.36 Draw a bar chart of the total consumption over ten days for each of the three users in Example 2.33
import numpy as np, pandas as pd
from matplotlib.pyplot import *
a = pd.read_excel("Pdata2_33.xlsx", usecols=range(1, 4))  # extract columns 2 to 4
c = np.sum(a)  # sum of each column
ind = np.array([1, 2, 3]); width = 0.2
rc('font', size=16); bar(ind, c, width)  # bar(x, height, width)
ylabel("Consumer data")
xticks(ind, ['User A', 'User B', 'User C'], rotation=20)  # rotate the tick labels by 20 degrees
rcParams['font.sans-serif'] = ['SimHei']  # needed to display Chinese labels correctly
savefig('figure2_36.png', dpi=500)  # save the image as figure2_36.png at 500 dpi
show()
Note 2.6 By default, Chinese text in Matplotlib drawings is usually garbled. To display Chinese characters, minus signs, etc. in a figure, use the following settings:
rcParams['font.sans-serif'] = ['SimHei']  # display Chinese labels correctly
rcParams['axes.unicode_minus'] = False  # display the minus sign correctly
Or, equivalently:
rc('font', family='SimHei')  # display Chinese labels correctly
rc('axes', unicode_minus=False)  # display the minus sign correctly
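The two ways of writing these settings can be compared by reading the values back from rcParams. This is a small sketch using rc_context so that nothing leaks into later plots. One detail worth noting: rc('font', family=...) sets the key font.family, while the first form sets font.sans-serif, so the font lines aim at the same result through different keys.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

with plt.rc_context():                       # changes are undone on exit
    plt.rc("axes", unicode_minus=False)
    via_rc = plt.rcParams["axes.unicode_minus"]

with plt.rc_context():
    plt.rcParams["axes.unicode_minus"] = False
    via_rcparams = plt.rcParams["axes.unicode_minus"]

with plt.rc_context():
    plt.rc("font", family="SimHei")          # stored under the key 'font.family'
    family = plt.rcParams["font.family"]

print(via_rc, via_rcparams, family)
```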
2.4.2 Visual Applications of matplotlib.pyplot
1. Scatter plot
Example 2.37 To measure the wear rate of a cutting tool, the following experiment is carried out: at regular intervals (for example, every hour) the thickness of the tool is measured, giving a set of experimental data (t_i, y_i) (i = 1, 2, ..., 8), as shown in Table 2.11. Draw a scatter plot of the observed data.
import numpy as np
from matplotlib.pyplot import *
x = np.array(range(8))
y = '27.0 26.8 26.5 26.3 26.1 25.7 25.3 24.8'  # the data are pasted in as a string
y = ",".join(y.split())  # replace the spaces with commas
print(y)
print(eval(y))
# typing the data by hand is too much trouble, so the conversion is done by the program;
# eval turns the comma-separated string into a tuple
y = np.array(eval(y))
print(y)
scatter(x, y)
savefig('figure2_23.png', dpi=500); show()
About the eval function: eval is an easy-to-use function that evaluates a string expression. See the CSDN blog post "Python eval() function, this is enough".
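Because eval executes whatever expression the string contains, it is worth knowing that the same conversion can be done without it. The following sketch compares the eval route of Example 2.37 with a plain split-and-convert alternative; the alternative is a suggestion here, not the method used in the original example.

```python
import numpy as np

raw = "27.0 26.8 26.5 26.3 26.1 25.7 25.3 24.8"   # the pasted data from Example 2.37

via_eval = np.array(eval(",".join(raw.split())))   # the approach used in the example
via_split = np.array(raw.split(), dtype=float)     # no code execution involved

print(np.array_equal(via_eval, via_split))
```

Both routes give the same eight floating-point values; the second one only parses numbers and fails with an error on anything else.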
2. Multiple graphs are displayed in a graphics screen
Example 2.38 Draw y = sin(x) and y = cos(x^2), x in [0, 2π], in the same figure
import numpy as np
from matplotlib.pyplot import *
# NumPy's linspace function creates an arithmetic sequence. Usage: linspace(x1, x2, n), where x1, x2, n are
# the start value, the end value, and the number of elements. If n is omitted, the default is 50 points.
x = np.linspace(0, 2*np.pi, 200)
y1 = np.sin(x); y2 = np.cos(pow(x, 2))
rc('font', size=16); rc('text', usetex=True)  # use TeX fonts
plot(x, y1, 'r', label='$sin(x)$', linewidth=2)  # labels are shown in LaTeX format
plot(x, y2, 'b--', label='$cos(x^2)$')
xlabel('$x$'); ylabel('$y$', rotation=0)
legend(); savefig('figure2_38.png', dpi=500); show()  # add the legend before saving so it appears in the file
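A quick check of how linspace behaves, as a minimal sketch: both endpoints are included, and when the number of points is omitted NumPy uses 50.

```python
import numpy as np

pts = np.linspace(0, 1, 5)            # 5 points, both endpoints included
default = np.linspace(0, 2 * np.pi)   # number of points omitted

print(pts)
print(len(default))
```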
3. Multiple graphs separately
Example 2.39 Divide the figure into three sub-windows, two small ones on top and one large one below, and draw y = sin(x), y = cos(x), y = sin(x^2), x in [0, 2π], in the three sub-windows respectively
import numpy as np
from matplotlib.pyplot import *
x = np.linspace(0, 2*np.pi, 200)  # 200 equally spaced points from 0 to 2π
y1 = np.sin(x); y2 = np.cos(x); y3 = np.sin(x*x)
rc('font', size=16); rc('text', usetex=True)  # use TeX fonts
ax1 = subplot(2, 2, 1)  # create the top-left sub-window
ax1.plot(x, y1, 'r', label='$sin(x)$')  # draw
legend()  # add the legend
ax2 = subplot(2, 2, 2)  # create the top-right sub-window
ax2.plot(x, y2, 'b--', label='$cos(x)$'); legend()
ax3 = subplot(2, 1, 2)  # create the lower sub-window (row 2 of a 2-row, 1-column layout)
ax3.plot(x, y3, 'k--', label='$sin(x^2)$'); legend()
savefig('figure2_39.png', dpi=500); show()
4. Curve in three-dimensional space
Example 2.40 Draw the three-dimensional curve x = t sin t, y = t cos t, z = t (t in [0, 100])
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
ax = plt.axes(projection='3d')  # set up 3D axes
z = np.linspace(0, 100, 1000)
x = np.sin(z)*z; y = np.cos(z)*z
ax.plot3D(x, y, z, 'k')
plt.savefig('figure2_40.png', dpi=500); plt.show()
5. Three-dimensional surface graphics
Example 2.41 Draw the 3D surface z = sin(sqrt(x^2 + y^2)) as a surface plot and as a wireframe (mesh) plot
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-6, 6, 30)
y = np.linspace(-6, 6, 30)
X, Y = np.meshgrid(x, y)  # grid point coordinate matrices, required for 3D surface plots
Z = np.sin(np.sqrt(X ** 2 + Y ** 2))
ax1 = plt.subplot(1, 2, 1, projection='3d')
ax1.plot_surface(X, Y, Z, cmap='viridis')
ax1.set_xlabel('x'); ax1.set_ylabel('y'); ax1.set_zlabel('z')
ax2 = plt.subplot(1, 2, 2, projection='3d')
ax2.plot_wireframe(X, Y, Z, color='c')  # wireframe plot; plot_surface can be used instead
ax2.set_xlabel('x'); ax2.set_ylabel('y'); ax2.set_zlabel('z')
plt.savefig('figure2_41.png', dpi=500); plt.show()
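What meshgrid actually returns is easier to see on a tiny grid. The following is a minimal sketch with arbitrary values: both outputs have shape (len(y), len(x)), so a function of X and Y is evaluated at every grid node in one expression.

```python
import numpy as np

x = np.array([0, 1, 2])        # 3 values along x
y = np.array([10, 20])         # 2 values along y
X, Y = np.meshgrid(x, y)       # both have shape (len(y), len(x))

Z = X + Y                      # evaluate a function at every grid node
print(X.shape, Y.shape)
print(Z)
```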
6. Contour map
Example 2.42 For the plane region 0 <= x <= 1400, 0 <= y <= 1200, the elevation data at the grid nodes with step size 100 are given in the table.
1. Draw the contour lines of the region
2. Draw the three-dimensional surface of the region
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
z = np.loadtxt("Pdata2_42.txt")  # load the elevation data
x = np.arange(0, 1500, 100)  # the start value is included, so 0 can be written directly
y = np.arange(1200, -10, -100)  # the stop value is excluded, so -10 is used instead of 0
contr = plt.contour(x, y, z); plt.clabel(contr)  # draw the contour lines and label them
plt.xlabel('$x$'); plt.ylabel('$y$', rotation=0)
plt.savefig('figure2_42_1.png', dpi=500)
plt.figure()  # create a new figure
ax = plt.axes(projection='3d')  # create 3D axes in this figure
X, Y = np.meshgrid(x, y)  # grid point coordinate matrices, required for 3D surface plots
ax.plot_surface(X, Y, z, cmap='viridis')  # surface plot; plot_wireframe can be used instead
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.savefig('figure2_42_2.png', dpi=500); plt.show()
You can also put both plots on one canvas:
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt
import numpy as np
z = np.loadtxt("Pdata2_42.txt")
x = np.arange(0, 1500, 100)
y = np.arange(1200, -10, -100)
ax1 = plt.subplot(1, 2, 1)  # allocate the sub-window, as covered earlier
contr = plt.contour(x, y, z); plt.clabel(contr)
plt.xlabel('$x$'); plt.ylabel('$y$', rotation=0)
ax2 = plt.subplot(1, 2, 2, projection='3d')  # allocate the sub-window, as covered earlier
plt.sca(ax2)
X, Y = np.meshgrid(x, y)
ax2.plot_surface(X, Y, z, cmap='viridis')
ax2.set_xlabel('x'); ax2.set_ylabel('y'); ax2.set_zlabel('z')
plt.savefig('figure2_42_2.png', dpi=500); plt.show()
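The reason for writing 1500 and -10 as stop values in Example 2.42 can be verified directly: arange includes the start value and excludes the stop value. A small sketch:

```python
import numpy as np

x = np.arange(0, 1500, 100)      # stop value 1500 is excluded, so the last entry is 1400
y = np.arange(1200, -10, -100)   # stop value -10 is excluded, so 0 is kept

print(x[0], x[-1], len(x))
print(y[0], y[-1], len(y))
```

With these coordinate vectors, contour(x, y, z) expects the elevation matrix z to have shape (len(y), len(x)), that is (13, 15); that the data file has this shape is an assumption here.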
7. Vector plot (arrow map)
Example 2.43 Draw the vector field of the velocity vector (u, v) = (y cos x, y sin x)
import matplotlib.pyplot as plt
from numpy import *
x = linspace(0, 15, 11); y = linspace(0, 10, 12)
x, y = meshgrid(x, y)  # generate the grid data
v1 = y*cos(x); v2 = y*sin(x)
plt.quiver(x, y, v1, v2)  # quiver draws an arrow plot, which is very useful for showing how a gradient changes
plt.savefig('figure2_43.png', dpi=500); plt.show()
2.4.3 Integrated application of visualization
Example 2.44 Draw y1 = sin(x), y2 = cos(x), y3 = sin(x^2), y4 = x sin x, x in [0, 2π], in one combined figure.
import numpy as np
from matplotlib.pyplot import *
x = np.linspace(0, 2*np.pi, 200)
y1 = np.sin(x); y2 = np.cos(x); y3 = np.sin(x*x); y4 = x*np.sin(x)
rc('font', size=16); rc('text', usetex=True)  # use TeX fonts
ax1 = subplot(2, 3, 1)  # create sub-window 1 (top left)
ax1.plot(x, y1, 'r', label='$sin(x)$')  # draw
legend()  # add the legend
ax2 = subplot(2, 3, 2)  # create sub-window 2
ax2.plot(x, y2, 'b--', label='$cos(x)$'); legend()
ax3 = subplot(2, 3, (3, 6))  # sub-windows 3 and 6 merged
ax3.plot(x, y3, 'k--', label='$sin(x^2)$'); legend()
ax4 = subplot(2, 3, (4, 5))  # sub-windows 4 and 5 merged
ax4.plot(x, y4, 'k--', label='$xsin(x)$'); legend()
savefig('figure2_44.png', dpi=500); show()
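The tuple form of the third subplot argument merges grid cells, counted row by row starting at 1. The sketch below reads back which rows and columns each merged sub-window occupies on a 2 x 3 grid; it uses a non-interactive backend and draws nothing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure()
ax3 = fig.add_subplot(2, 3, (3, 6))   # cells 3 and 6: right column, both rows
ax4 = fig.add_subplot(2, 3, (4, 5))   # cells 4 and 5: bottom row, first two columns

spec3 = ax3.get_subplotspec()
spec4 = ax4.get_subplotspec()
print(list(spec3.rowspan), list(spec3.colspan))
print(list(spec4.rowspan), list(spec4.colspan))
```

The row and column indices printed here are 0-based, unlike the 1-based cell numbers passed to subplot.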
Example 2.45 Given 8568 commodity transaction records (file name trade.xlsx) in the format shown in Figure 2.13, visualize the data.
(Temporarily skipped; the code has an error)
2.5 scipy.stats Module Introduction
2.5.1 Random variables and distributions
The scipy.stats module provides the probability density, distribution function, etc. of specific distributions, and can also do simple statistical analysis.
1. Continuous random variables and distributions
Continuous random variable objects have the following methods.
- rvs: generates random numbers; the size of the output array can be specified via the size parameter.
- pdf: probability density function of the random variable.
- cdf: distribution function of the random variable.
- sf: survival function of the random variable; its value is 1 - cdf.
- ppf: inverse of the distribution function.
- stats: computes the expectation and variance of the random variable.
- fit: uses maximum likelihood estimation to estimate the unknown population parameters from a set of random samples.
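The methods listed above can be tried on a single distribution. The following is a minimal sketch using a normal distribution with arbitrarily chosen parameters (mean 1, standard deviation 2); the sample size and seed are also just illustration values.

```python
import numpy as np
from scipy.stats import norm

rv = norm(loc=1.0, scale=2.0)                 # frozen N(1, 2^2), parameters chosen for illustration

sample = rv.rvs(size=2000, random_state=0)    # random numbers
density = rv.pdf(1.0)                         # probability density at x = 1
left = rv.cdf(3.0)                            # P(X <= 3)
right = rv.sf(3.0)                            # P(X > 3) = 1 - cdf
quantile = rv.ppf(left)                       # inverse of cdf, recovers 3.0 up to rounding
mean, var = rv.stats(moments="mv")            # expectation and variance
mu_hat, sigma_hat = norm.fit(sample)          # maximum likelihood estimates from the sample

print(left + right, quantile, mean, var)
print(mu_hat, sigma_hat)
```

The estimates from fit vary with the sample, so they are only close to, not equal to, the true parameters.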
Probability density functions of common continuous random variables
The main functions corresponding to the normal distribution are as follows
2. Discrete random variables and distributions
The commonly used discrete distribution law functions are shown in the following table
2.5.2 Probability density function and distribution law visualization
Definition: definition of the Gamma distribution
Properties:
Call format:
gamma.pdf(x, a, loc=0, scale=1)
Here, a = α and scale = β (the default is 1)
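To make the call format concrete, the sketch below compares gamma.pdf with the density written out by hand, assuming the shape-scale parameterisation f(x) = x^(α-1) e^(-x/β) / (β^α Γ(α)) with a = α and scale = β. That this matches the book's definition is an assumption, based on the α and β labels used in Example 2.46.

```python
import numpy as np
from scipy.special import gamma as gamma_func
from scipy.stats import gamma

alpha, beta = 4.0, 2.0                 # shape and scale, as in the first curve of Example 2.46
x = np.linspace(0.1, 15, 50)

from_scipy = gamma.pdf(x, alpha, loc=0, scale=beta)
# Density written out by hand under the assumed shape-scale parameterisation.
from_formula = x ** (alpha - 1) * np.exp(-x / beta) / (beta ** alpha * gamma_func(alpha))

print(np.allclose(from_scipy, from_formula))
```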
Example 2.46 Draw the probability density curves of four different Gamma distributions in one figure
from matplotlib.pyplot import plot, legend, xlabel, ylabel, savefig, show, rc
from scipy.stats import gamma
from numpy import linspace
x = linspace(0, 15, 100); rc('font', size=15); rc('text', usetex=True)
plot(x, gamma.pdf(x, 4, 0, 2), 'r*-', label="$\\alpha=4, \\beta=2$")
plot(x, gamma.pdf(x, 4, 0, 1), 'bp-', label="$\\alpha=4, \\beta=1$")
plot(x, gamma.pdf(x, 4, 0, 0.5), '.k-', label="$\\alpha=4, \\beta=0.5$")
plot(x, gamma.pdf(x, 2, 0, 0.5), '>g-', label="$\\alpha=2, \\beta=0.5$")
legend(); xlabel('$x$'); ylabel('$f(x)$')
savefig("figure2_46.png", dpi=500); show()
Example 2.47 Draw four different normal distribution density functions in four sub-windows
import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm
mu0 = [-1, 0]; s0 = [0.5, 1]
x = np.linspace(-7, 7, 100); plt.rc('font', size=15)
plt.rc('text', usetex=True); plt.rc('axes', unicode_minus=False)
f, ax = plt.subplots(len(mu0), len(s0), sharex=True, sharey=True)
for i in range(2):
    for j in range(2):
        mu = mu0[i]; s = s0[j]
        y = norm(mu, s).pdf(x)
        ax[i, j].plot(x, y)
        # invisible single point whose only purpose is to carry the legend text
        ax[i, j].plot(1, 0, label="$\\mu$ = {:3.2f}\n$\\sigma$ = {:3.2f}".format(mu, s))
        ax[i, j].legend(fontsize=12)
ax[1, 1].set_xlabel('$x$')
ax[0, 0].set_ylabel('pdf($x$)')
plt.savefig('figure2_47.png'); plt.show()
Example 2.48 The random variable X ~ B(n, p) (binomial distribution); the distribution law of X is P{X = k} = C(n, k) p^k (1 - p)^(n - k), k = 0, 1, ..., n.
Draw the "matchstick" (stem) plot of the distribution law of the binomial distribution B(5, 0.4)
from scipy.stats import binom
import matplotlib.pyplot as plt
import numpy as np
n, p = 5, 0.4
x = np.arange(6); y = binom.pmf(x, n, p)
plt.subplot(121); plt.plot(x, y, 'ro')
# vlines(x, ymin, ymax) draws vertical lines; lw sets the line width, alpha sets the transparency
plt.vlines(x, 0, y, 'g', lw=3, alpha=0.5)
# older Matplotlib versions took use_line_collection=True here; recent versions no longer accept it
plt.subplot(122); plt.stem(x, y)
plt.savefig("figure2_48.png", dpi=500); plt.show()
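The values plotted in Example 2.48 can be cross-checked against the binomial formula P{X = k} = C(n, k) p^k (1 - p)^(n - k). The following is a short sketch using the same n = 5 and p = 0.4:

```python
from math import comb

import numpy as np
from scipy.stats import binom

n, p = 5, 0.4
k = np.arange(n + 1)

pmf = binom.pmf(k, n, p)
# The same probabilities computed directly from the formula.
by_hand = np.array([comb(n, int(i)) * p ** int(i) * (1 - p) ** (n - int(i)) for i in k])

print(np.allclose(pmf, by_hand), pmf.sum())
```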
Source: https://programmersought.com/article/316910517957/