How do you find the rows in two pandas DataFrames that have the same values in one or more columns?
s1 = pd.merge(df1, df2, how='inner', on=['userId', 'movieId'])
This finds the rows in df1 and df2 whose userId and movieId values both match, then merges them into a single row each.
df1: userId, movieId, rating
df2: userId, movieId, tag
s1: userId, movieId, rating, tag
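A minimal, runnable sketch of the merge above. The sample values are made up for illustration; only the column names (userId, movieId, rating, tag) come from the description.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the text.
df1 = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating": [4.0, 3.5, 5.0, 2.0],
})
df2 = pd.DataFrame({
    "userId": [1, 2, 2, 4],
    "movieId": [10, 10, 40, 30],
    "tag": ["funny", "classic", "slow", "dark"],
})

# Keep only rows whose (userId, movieId) pair appears in both frames.
s1 = pd.merge(df1, df2, how="inner", on=["userId", "movieId"])
print(s1)
```

With this data only the pairs (1, 10) and (2, 10) occur in both frames, so s1 has two rows, each carrying both the rating from df1 and the tag from df2.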
how : {'left', 'right', 'outer', 'inner'}, default 'inner'
- left: use only keys from left frame (SQL: left outer join)
- right: use only keys from right frame (SQL: right outer join)
- outer: use union of keys from both frames (SQL: full outer join)
- inner: use intersection of keys from both frames (SQL: inner join)
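To see how the four options differ, the sketch below merges the same two frames with each value of how and counts the resulting rows. The data is hypothetical and chosen so that some key pairs exist in only one frame.

```python
import pandas as pd

# Hypothetical sample data with partially overlapping (userId, movieId) keys.
df1 = pd.DataFrame({
    "userId": [1, 1, 2, 3],
    "movieId": [10, 20, 10, 30],
    "rating": [4.0, 3.5, 5.0, 2.0],
})
df2 = pd.DataFrame({
    "userId": [1, 2, 2, 4],
    "movieId": [10, 10, 40, 30],
    "tag": ["funny", "classic", "slow", "dark"],
})

keys = ["userId", "movieId"]

# Number of rows produced by each join type.
row_counts = {
    how: len(pd.merge(df1, df2, how=how, on=keys))
    for how in ["left", "right", "outer", "inner"]
}
print(row_counts)

# Rows without a match on the other side get NaN in that side's columns.
left_joined = pd.merge(df1, df2, how="left", on=keys)
print(left_joined)
```

Here left and right each keep all four keys of their own frame, outer keeps the six distinct keys found in either frame, and inner keeps the two keys found in both. In the left join, the two df1 rows with no partner in df2 have NaN in the tag column.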
on : label or list. Field names to join on. Must be found in both DataFrames. If on is None and not merging on indexes, then it merges on the intersection of the columns by default.
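As a quick check of that default, the sketch below (again with made-up data) omits on entirely. Since userId and movieId are the only columns the two frames share, the result should match the merge with the keys spelled out.

```python
import pandas as pd

# Hypothetical sample data; the shared columns are userId and movieId.
df1 = pd.DataFrame({
    "userId": [1, 2, 3],
    "movieId": [10, 10, 30],
    "rating": [4.0, 5.0, 2.0],
})
df2 = pd.DataFrame({
    "userId": [1, 2, 4],
    "movieId": [10, 10, 30],
    "tag": ["funny", "classic", "dark"],
})

# No `on` given: pandas joins on the columns common to both frames.
implicit = pd.merge(df1, df2)

# Same merge with the key columns named explicitly.
explicit = pd.merge(df1, df2, how="inner", on=["userId", "movieId"])

print(implicit.equals(explicit))
```

Naming the keys explicitly is usually the safer choice: if the frames happened to share another column, such as a timestamp, the implicit form would silently join on that column as well.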