Count Number of Rows Where the Value of Two Columns Are Not Equal + R

Prerequisite: Pandas.Dataframes in Python

The rows of a dataframe can be selected based on conditions, much as we do with SQL queries. The various methods to achieve this are explained in this article with examples. To explain the methods, a dataset has been created which contains the points scored by 10 people in various games. The dataset is loaded into a dataframe and displayed first. Ten people with unique player ids (Pid) have played different games with different game ids (game_id), and the points scored in each game are added as an entry to the table. Some of the players' points are not recorded, and thus NaN values appear in the table.


Note: To get the CSV file used, click here.



Python3

import pandas as pd

# load the dataset; replace the placeholder with the folder that holds the file
df = pd.read_csv(r"__your file path__\example2.csv")
print(df)

Output:

dataset example2.csv
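If the CSV file is not at hand, a small stand-in dataframe can be used to follow along. The sketch below assumes only the column names mentioned in this article (Pid, game_id, name, points); every value in it is invented for illustration and is not the content of example2.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for example2.csv: the column names follow the article,
# but all values below are made up for illustration.
df = pd.DataFrame({
    "Pid": ["p01", "p02", "p03", "p01", "p04", "p02"],
    "game_id": ["g11", "g21", "g21", "g34", "g11", "g34"],
    "name": ["Albert", "Louis", "John", "Albert", "Maria", "Louis"],
    "points": [62.0, 45.0, np.nan, 71.0, 88.0, 53.0],
})

print(df)

# count the unrecorded (NaN) entries per column
missing_per_column = df.isna().sum()
print(missing_per_column)
```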

Boolean Indexing method

In this method, each row is checked against a condition on a specified column, giving True or False. The rows which yield True are included in the output. This can be achieved in various ways. The query used is: Select rows where the column Pid='p01'

Example 1: Checking condition while indexing

Python3

# keep only the rows whose Pid equals 'p01'
df_new = df[df['Pid'] == 'p01']
print(df_new)

Output

Example 2: Specifying the condition as a 'mask' variable

The selected rows are assigned to a new dataframe. The new dataframe keeps the index labels the rows had in the old dataframe, and the columns remain the same.

Python3

# boolean Series: True where Pid equals 'p01'
mask = df['Pid'] == 'p01'
df_new = pd.DataFrame(df[mask])
print(df_new)

Output
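To make the point about the index concrete, here is a minimal sketch on an invented four-row dataframe (the values are hypothetical, not taken from example2.csv). It shows the mask itself, the index labels carried over from the old dataframe, and how reset_index() can renumber the rows if that is preferred.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "Pid": ["p01", "p02", "p01", "p03"],
    "points": [62.0, 45.0, 71.0, 53.0],
})

mask = df["Pid"] == "p01"          # boolean Series, one value per row
df_new = df[mask]                  # rows where the mask is True

print(mask.tolist())
print(df_new.index.tolist())       # labels carried over from df

# optional: renumber the rows from 0 and drop the old labels
df_reset = df_new.reset_index(drop=True)
print(df_reset.index.tolist())
```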

Example 3: Combining mask and dataframes.values property

The query here is Select the rows with game_id 'g21'.

Python3

# .values gives the underlying array, so the mask is an array rather than a Series
mask = df['game_id'].values == 'g21'
df_new = df[mask]
print(df_new)

Output
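As a rough sketch of what the .values property changes, the snippet below builds the mask both ways on invented data and checks that they select the same rows. The dataframe contents are hypothetical; the exact array type printed may vary with the pandas version.

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "game_id": ["g11", "g21", "g21", "g34"],
    "points": [62.0, 45.0, 71.0, 53.0],
})

series_mask = df["game_id"] == "g21"         # pandas Series, carries the index
array_mask = df["game_id"].values == "g21"   # plain array, no index attached

print(type(series_mask))
print(type(array_mask))

# both masks pick out the same rows
same_rows = df[series_mask].equals(df[array_mask])
print(same_rows)
```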

Positional indexing

The loc() and iloc() indexers can be used for slicing dataframes in Python. Among the differences between loc() and iloc(), the important thing to note is that iloc() is based on integer positions, while loc() is based on labels and can also take boolean indices.
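The label-versus-position distinction is easiest to see on a filtered dataframe, where the index labels are no longer 0, 1, 2, and so on. The following is a small sketch on invented data (the values are hypothetical).

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "game_id": ["g11", "g21", "g34", "g21"],
    "points": [62.0, 45.0, 71.0, 53.0],
})

# filtering keeps the original labels, here 1 and 3
subset = df[df["game_id"] == "g21"]
print(subset.index.tolist())

by_label = subset.loc[3]       # the row whose index label is 3
by_position = subset.iloc[1]   # the second row of the subset, by position

print(by_label["points"], by_position["points"])
```

Both lookups return the same row here, but subset.loc[1] and subset.iloc[1] would not: the first refers to the label 1, the second to the row in position 1.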


Example 1: Using loc()

The mask gives a boolean value for each row, and whichever rows evaluate to True will appear in the result. Here, the query is to select the rows where game_id is g21.

Python3

# boolean array: True where game_id equals 'g21'
mask = df['game_id'].values == 'g21'
df_new = df.loc[mask]
print(df_new)

Output

Example 2: Using iloc()

The query is the same as the one taken above. Since iloc() works with integer positions, the mask array is passed to NumPy's flatnonzero() function, which returns the positions in the array where the value is non-zero (that is, not False).

Python3

import numpy as np

mask = df['game_id'].values == 'g21'
print("Mask array :", mask)

# integer positions of the True entries in the mask
pos = np.flatnonzero(mask)
print("\nRows selected :", pos)

df.iloc[pos]

Output
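A minimal sketch of the flatnonzero() step on invented data (hypothetical values), checking that selecting by the resulting positions gives the same rows as indexing with the mask directly:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "game_id": ["g11", "g21", "g34", "g21"],
    "points": [62.0, 45.0, 71.0, 53.0],
})

mask = df["game_id"].values == "g21"
pos = np.flatnonzero(mask)          # positions where the mask is True

by_position = df.iloc[pos]          # select by integer positions
by_mask = df[mask]                  # select by the boolean mask

print(pos)
print(by_position.equals(by_mask))
```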



Using dataframe.query()

The query() method takes up the expression that returns a boolean value, processes all the rows in the dataframe, and returns the resultant dataframe with selected rows.

Example 1: Select rows where name="Albert"

Python3

df.query('name=="Albert"')

Output

Example 2: Select rows where points>50 and the player is not Albert.

This example demonstrates that logical operators like AND/OR can be used to check multiple conditions.

Python3

df.query('points>50 & name!="Albert"')

Output
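As a small sketch of combining conditions in query(), the snippet below uses the and/or keywords, which query() accepts alongside & and |, and refers to a Python variable with the @ prefix. The data and the variable name threshold are hypothetical. Rows with NaN points fail the points > threshold comparison, so they drop out of the first result.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Albert", "Maria"],
    "points": [62.0, 45.0, np.nan, 30.0, 88.0],
})

threshold = 50  # hypothetical cut-off, referenced inside the query with @

# both conditions must hold
high_not_albert = df.query('points > @threshold and name != "Albert"')

# either condition may hold
high_or_albert = df.query('points > @threshold or name == "Albert"')

print(high_not_albert)
print(high_or_albert)
```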



Using isin()

This dataframe method takes an iterable, a series, or another dataframe as a parameter and checks whether elements of the dataframe exist in it. The rows which evaluate to True are included in the result.

Example 1: Select the rows where players are Albert, Louis, and John.

Python3

# names to look for in the 'name' column
li = ['Albert', 'Louis', 'John']
df[df.name.isin(li)]

Output

Example 2: Select rows where points>50 and players are not Albert, Louis and John.

The tilde symbol (~) negates the expression evaluated.

Python3

li = ['Albert', 'Louis', 'John']
# points above 50 AND name not in the list
df[(df.points > 50) & (~df.name.isin(li))]

Output
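The two isin() examples can be sketched on invented data as follows (all values are hypothetical). Note the parentheses around each condition: in Python, & binds more tightly than comparison operators such as >, so leaving them out changes how the expression is grouped.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Maria", "Chen"],
    "points": [62.0, 45.0, np.nan, 88.0, 40.0],
})

li = ["Albert", "Louis", "John"]

in_list = df[df["name"].isin(li)]                           # name is in li
high_not_in_list = df[(df["points"] > 50) & (~df["name"].isin(li))]

print(in_list)
print(high_not_in_list)
```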



Using np.where()

NumPy's where() function can be combined with pandas' isin() function to select rows. In the timings reported in this section, the numpy.where() approach ran faster than the boolean-indexing approach, although the two timed snippets apply different conditions, so the figures are only indicative.

Example:

Python3

import numpy as np

# np.where() gives the positions where isin() is True; iloc selects those rows
df_new = df.iloc[np.where(df.name.isin(li))]

Output
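For reference, np.where() called with a single condition returns a tuple holding one array of positions. A minimal sketch on invented data (hypothetical values) that unpacks the tuple explicitly before passing the positions to iloc:

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Maria", "Chen"],
    "points": [62.0, 45.0, 71.0, 88.0, 40.0],
})

li = ["Albert", "Louis", "John"]

where_result = np.where(df["name"].isin(li))   # a tuple with one array inside
positions = where_result[0]                    # the array of row positions

df_new = df.iloc[positions]

print(type(where_result), positions)
print(df_new)
```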

Comparison with other methods

Python3

import numpy as np  # run this import in its own cell first

%%timeit
df_new = df.iloc[np.where(df.name.isin(li))]

Output:

756 µs ± 132 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Python3

%%timeit
li = ['Albert', 'Louis', 'John']
df[(df.points > 50) & (~df.name.isin(li))]

Output

1.7 ms ± 307 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
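Outside a notebook, where the %%timeit magic is not available, a comparison along these lines could be sketched with Python's standard timeit module. The snippet below is only an illustration on invented data: it times the same isin() condition both ways so that the two approaches are directly comparable, and it makes no claim about which one is faster, since that depends on the data size and the machine. The function names with_where and with_mask are hypothetical helpers.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({
    "name": ["Albert", "Louis", "John", "Maria", "Chen"] * 200,
    "points": [62.0, 45.0, 71.0, 88.0, 40.0] * 200,
})
li = ["Albert", "Louis", "John"]


def with_where():
    # positions from np.where(), rows selected with iloc
    return df.iloc[np.where(df["name"].isin(li))[0]]


def with_mask():
    # plain boolean indexing with the same condition
    return df[df["name"].isin(li)]


# both approaches should select exactly the same rows
results_match = with_where().equals(with_mask())

t_where = timeit.timeit(with_where, number=200)
t_mask = timeit.timeit(with_mask, number=200)

print(f"np.where + iloc : {t_where:.4f} s for 200 runs")
print(f"boolean mask    : {t_mask:.4f} s for 200 runs")
```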



Source: https://www.geeksforgeeks.org/how-to-select-rows-from-a-dataframe-based-on-column-values/
