Count Number of Rows Where the Value of Two Columns Are Not Equal + R
Prerequisite: Pandas.Dataframes in Python
The rows of a dataframe can be selected based on conditions, as we do with SQL queries. The various methods to achieve this are explained in this article with examples. To explain the methods, a dataset has been created which contains information on points scored by 10 people in various games. The dataset is loaded into the dataframe and visualized first. Ten people with unique player ids (Pid) have played different games with different game ids (game_id), and the points scored in each game are added as an entry to the table. Some of the players' points are not recorded, and thus NaN values appear in the table.
Note: To get the CSV file used, click here.
Python3
import pandas as pd
# Load the dataset; replace the placeholder with the folder holding the CSV
df = pd.read_csv(r"__your file path__\example2.csv")
print(df)
Output:
dataset example2.csv
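Since the CSV itself is not reproduced here, a small stand-in dataframe can be handy for trying out the examples. The sketch below is only an assumption about the shape of the data: the column names (Pid, game_id, name, points) come from the queries in this article, while every value, including the player "Maria", is invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for example2.csv; all values are invented.
df = pd.DataFrame({
    "Pid": ["p01", "p02", "p03", "p01", "p04"],
    "game_id": ["g21", "g21", "g14", "g14", "g21"],
    "name": ["Albert", "Louis", "John", "Albert", "Maria"],
    "points": [62.0, 48.0, np.nan, 75.0, 51.0],  # NaN marks an unrecorded score
})

print(df)
# Count how many scores were not recorded
missing = int(df["points"].isna().sum())
print("Unrecorded scores:", missing)
```

Any of the selections in the following sections should run against this frame in place of the one read from the CSV.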
Boolean Indexing method
In this method, each row is checked against a specified column condition and evaluates to True or False. The rows that yield True are included in the output. This can be achieved in various ways. The query used is: Select rows where the column Pid='p01'.
Example 1: Checking condition while indexing
Python3
df_new = df[df['Pid'] == 'p01']
print(df_new)
Output
Example 2: Specifying the condition 'mask' variable
The selected rows are assigned to a new dataframe, with the index of the rows from the old dataframe as the index in the new one and the columns remaining the same.
Python3
# Boolean Series: True where the player id matches
mask = df['Pid'] == 'p01'
df_new = pd.DataFrame(df[mask])
print(df_new)
Output
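To make the point about the preserved index concrete, here is a minimal sketch on a tiny invented dataframe (not the article's dataset). It shows that the filtered rows keep their original index labels, and that reset_index(drop=True) can be used if a fresh 0-based index is preferred.

```python
import pandas as pd

# Tiny invented frame, used only to illustrate index behaviour
df = pd.DataFrame({"Pid": ["p01", "p02", "p01"], "points": [10, 20, 30]})

mask = df["Pid"] == "p01"
df_new = df[mask]

# The selected rows keep the labels they had in the original frame
print(df_new.index.tolist())  # [0, 2]

# Optionally renumber the rows from zero
df_reset = df_new.reset_index(drop=True)
print(df_reset.index.tolist())  # [0, 1]
```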
Example 3: Combining mask and dataframes.values property
The query here is Select the rows with game_id 'g21'.
Python3
# Compare the underlying array of values to build the mask
mask = df['game_id'].values == 'g21'
df_new = df[mask]
print(df_new)
Output
Positional indexing
The loc() and iloc() indexers can be used for slicing dataframes in Python. Among the differences between loc() and iloc(), the main thing to note is that iloc() is position-based and expects integer positions, while loc() is label-based and can also take boolean masks.
Example 1: Using loc()
The mask gives a boolean value as an index for each row, and whichever rows evaluate to True will appear in the result. Here, the query is to select the rows where game_id is g21.
Python3
mask = df['game_id'].values == 'g21'
# loc accepts the boolean mask directly
df_new = df.loc[mask]
print(df_new)
Output
Example 2: Using iloc()
The query is the same as the one above. Here iloc() is given integer positions, so the mask array is passed to NumPy's flatnonzero() function, which returns the indices in the array where the value is not zero (False).
Python3
import numpy as np
mask = df['game_id'].values == 'g21'
print("Mask array :", mask)
# Convert the boolean mask to the integer positions of the True entries
pos = np.flatnonzero(mask)
print("\nRows selected :", pos)
df.iloc[pos]
Output
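As a quick check that the two routes agree, the sketch below builds a small invented dataframe (not the article's dataset) and compares the mask-based loc selection with the position-based iloc selection.

```python
import numpy as np
import pandas as pd

# Invented data for illustration only
df = pd.DataFrame({"game_id": ["g21", "g14", "g21"], "points": [5, 6, 7]})

mask = df["game_id"].values == "g21"
pos = np.flatnonzero(mask)      # integer positions where the mask is True

by_mask = df.loc[mask]          # boolean mask with loc
by_position = df.iloc[pos]      # integer positions with iloc

print(pos)
print(by_mask.equals(by_position))
```

Both selections should contain the same rows with the same index labels, so which one to use is mostly a matter of whether a mask or a list of positions is already at hand.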
Using dataframe.query()
The query() method takes up the expression that returns a boolean value, processes all the rows in the dataframe, and returns the resultant dataframe with selected rows.
Example 1: Select rows where name="Albert"
Python3
df.query('name=="Albert"')
Output
Example 2: Select rows where points>50 and the player is not Albert.
This example demonstrates that logical operators like AND/OR can be used to check multiple conditions.
Python3
df.query('points>50 & name!="Albert"')
Output
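When the values to compare against live in Python variables rather than in the query string, query() lets an expression refer to them with an @ prefix. The sketch below uses invented data and hypothetical variable names (min_points, excluded) to express the same kind of condition as Example 2.

```python
import pandas as pd

# Invented data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John"],
    "points": [70, 55, 40],
})

min_points = 50       # hypothetical threshold
excluded = "Albert"   # hypothetical player to leave out

# @name pulls the value of a local Python variable into the expression
result = df.query("points > @min_points and name != @excluded")
print(result)
```

Here `and` is used in place of `&`; query() accepts either spelling inside the expression string.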
Using isin()
This dataframe method takes an iterable, a Series, or another dataframe as a parameter and checks whether elements of the dataframe exist in it. The rows that evaluate to True are included in the result.
Example 1: Select the rows where players are Albert, Louis, and John.
Python3
li = ['Albert', 'Louis', 'John']
df[df.name.isin(li)]
Output
Example 2: Select rows where points>50 and players are not Albert, Louis and John.
The tilde symbol (~) provides the negation of the expression evaluated.
Python3
li = ['Albert', 'Louis', 'John']
# Keep rows scoring above 50 whose name is NOT in the list
df[(df.points > 50) & (~df.name.isin(li))]
Output
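The sketch below, on invented data with a hypothetical extra player "Maria", prints the intermediate boolean mask that isin() produces, which may make it easier to see what the tilde is negating.

```python
import pandas as pd

# Invented data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Maria"],
    "points": [70, 55, 40, 90],
})
li = ["Albert", "Louis", "John"]

in_list = df["name"].isin(li)   # True where the name appears in li
print(in_list.tolist())         # [True, True, True, False]

# ~in_list flips the mask, so only names outside li remain
selected = df[(df["points"] > 50) & ~in_list]
print(selected)
```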
Using np.where()
NumPy's where() function can be combined with pandas' isin() function to produce a faster result. In the timings reported in this article, numpy.where() ran faster than the approach used above.
Example:
Python3
import numpy as np
# np.where returns the positions of the True entries, which iloc can consume
df_new = df.iloc[np.where(df.name.isin(li))]
Output
Comparison with other methods
Python3
import numpy as np
%%timeit
df_new = df.iloc[np.where(df.name.isin(li))]
Output:
756 µs ± 132 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Python3
%%timeit
li = ['Albert', 'Louis', 'John']
df[(df.points > 50) & (~df.name.isin(li))]
Output
1.7 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
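The %%timeit magic only works inside IPython or Jupyter. For a plain Python script, a rough equivalent can be sketched with the standard library's timeit module. The data below is invented, the helper names (with_where, with_mask) are hypothetical, and the two functions apply the same isin() condition so that only the selection mechanism differs. Timings depend heavily on the machine and on the size of the dataframe, so no expected numbers are given.

```python
import timeit

import numpy as np
import pandas as pd

# Invented data: a small frame repeated to give the timer something to measure
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Maria"] * 250,
    "points": [70, 55, 40, 90] * 250,
})
li = ["Albert", "Louis", "John"]

def with_where():
    # np.where returns a tuple; its first element holds the row positions
    return df.iloc[np.where(df.name.isin(li))[0]]

def with_mask():
    # Plain boolean indexing with the same condition
    return df[df.name.isin(li)]

t_where = timeit.timeit(with_where, number=200)
t_mask = timeit.timeit(with_mask, number=200)

print(f"np.where + iloc : {t_where:.4f} s for 200 runs")
print(f"boolean mask    : {t_mask:.4f} s for 200 runs")
```

Both functions return the same rows, so any difference in the printed times reflects the selection route rather than the condition being evaluated.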
Source: https://www.geeksforgeeks.org/how-to-select-rows-from-a-dataframe-based-on-column-values/