Creating A Pandas DataFrame Code Example

Snippet 1
>>> import pandas as pd
>>> d = {'col1': [1, 2], 'col2': [3, 4]}
>>> df = pd.DataFrame(data=d)
>>> df
   col1  col2
0     1     3
1     2     4
 
Snippet 2
>>> import numpy as np
>>> import pandas as pd
>>> df2 = pd.DataFrame(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
...                    columns=['a', 'b', 'c'])
>>> df2
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
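
Snippets 1 and 2 build a DataFrame from a dict of columns and from a NumPy array. A third common input is a list of dicts, one per row. The sketch below is illustrative only: the records, the variable names `people` and `labeled`, and the row labels are hypothetical and do not come from the snippets above.

```python
import pandas as pd

# Hypothetical sample records; names and values are illustrative only.
records = [
    {"name": "Alice", "age": 1},
    {"name": "Bob", "age": 2},
]

# Each dict becomes one row; the dict keys become the column names.
people = pd.DataFrame(records)

# Same data, with explicit row labels passed through the index parameter.
labeled = pd.DataFrame(records, index=["r1", "r2"])

print(people)
print(labeled.loc["r2", "name"])
```

Passing `index` only relabels the rows. The columns still come from the dict keys.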
 
Snippet 3
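# Creates a Spark (PySpark) DataFrame, not a pandas one.
# Assumes an active SparkSession named `spark` and its SparkContext `sc`.
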
l = [('Alice', 1)]
spark.createDataFrame(l).collect()
# [Row(_1=u'Alice', _2=1)]
spark.createDataFrame(l, ['name', 'age']).collect()
# [Row(name=u'Alice', age=1)]

d = [{'name': 'Alice', 'age': 1}]
spark.createDataFrame(d).collect()
# [Row(age=1, name=u'Alice')]

rdd = sc.parallelize(l)
spark.createDataFrame(rdd).collect()
# [Row(_1=u'Alice', _2=1)]
df = spark.createDataFrame(rdd, ['name', 'age'])
df.collect()
# [Row(name=u'Alice', age=1)]

from pyspark.sql import Row
Person = Row('name', 'age')
person = rdd.map(lambda r: Person(*r))
df2 = spark.createDataFrame(person)
df2.collect()
# [Row(name=u'Alice', age=1)]

from pyspark.sql.types import StructType, StructField, StringType, IntegerType
schema = StructType([
    StructField("name", StringType(), True),
    StructField("age", IntegerType(), True)])
df3 = spark.createDataFrame(rdd, schema)
df3.collect()
# [Row(name=u'Alice', age=1)]

import pandas
spark.createDataFrame(df.toPandas()).collect()
# [Row(name=u'Alice', age=1)]
spark.createDataFrame(pandas.DataFrame([[1, 2]])).collect()
# [Row(0=1, 1=2)]

spark.createDataFrame(rdd, "a: string, b: int").collect()
# [Row(a=u'Alice', b=1)]
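# Keep only the second field, so each RDD element is a bare int.
rdd = rdd.map(lambda row: row[1])
spark.createDataFrame(rdd, "int").collect()
# [Row(value=1)]
# The int data does not match a boolean schema, so this call fails.
spark.createDataFrame(rdd, "boolean").collect()
# Traceback (most recent call last):
#     ...
# Py4JJavaError: ...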
