Creating a DataFrame

Creates a DataFrame object from a flat data structure.

danfo.DataFrame(data, options)

Parameters:

data (2D Array, 2D Tensor, or JSON object): Flat data structure to load into the DataFrame.

options (Object): Optional configuration object. Supported properties are:

  • index: Array of numeric or string labels used for subsetting rows. If not specified, the index is auto-generated.
  • columns: Array of column names. If not specified, column names are auto-generated.
  • dtypes: Array of data types, one for each column. If not specified, dtypes are inferred.
  • config: General configuration object for extending or setting NDframe behavior. See Configuration Options for the full list.

To create a DataFrame, call the constructor with the new keyword and pass in a flat data structure. The following examples show how to create DataFrames from different inputs and with different configuration options.

Creating a DataFrame from a JSON object:

Node:

const dfd = require("danfojs-node")

const json_data = [{ A: 0.4612, B: 4.28283, C: -1.509, D: -1.1352 },
                   { A: 0.5112, B: -0.22863, C: -3.39059, D: 1.1632 },
                   { A: 0.6911, B: -0.82863, C: -1.5059, D: 2.1352 },
                   { A: 0.4692, B: -1.28863, C: 4.5059, D: 4.1632 }]

const df = new dfd.DataFrame(json_data)
df.print()

Browser:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- danfojs CDN -->
    <script src="https://cdn.jsdelivr.net/npm/danfojs@1.2.0/lib/bundle.min.js"></script>
    <title>Document</title>
</head>

<body>

    <script>

        const json_data = [{ A: 0.4612, B: 4.28283, C: -1.509, D: -1.1352 },
                           { A: 0.5112, B: -0.22863, C: -3.39059, D: 1.1632 },
                           { A: 0.6911, B: -0.82863, C: -1.5059, D: 2.1352 },
                           { A: 0.4692, B: -1.28863, C: 4.5059, D: 4.1632 }]

        const df = new dfd.DataFrame(json_data)
        df.print()

    </script>
</body>

</html>
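
For intuition about how an array of records lines up with a table of named columns, here is a minimal plain-JavaScript sketch. It is not Danfo.js internals: `recordsToTable` is a hypothetical helper invented for this illustration, and it assumes every record carries the same keys as the first one.

```javascript
// Hypothetical helper for illustration only; not part of the Danfo.js API.
// Assumes every record carries the same keys as the first one.
function recordsToTable(records) {
  // Column names come from the keys of the first record.
  const columns = Object.keys(records[0]);
  // Each record becomes one row, with values ordered by column name.
  const rows = records.map((record) => columns.map((name) => record[name]));
  return { columns, rows };
}

const jsonData = [
  { A: 0.4612, B: 4.28283, C: -1.509, D: -1.1352 },
  { A: 0.5112, B: -0.22863, C: -3.39059, D: 1.1632 },
];

const table = recordsToTable(jsonData);
console.log(table.columns); // [ 'A', 'B', 'C', 'D' ]
console.log(table.rows[0]); // [ 0.4612, 4.28283, -1.509, -1.1352 ]
```

Each object contributes one row and each key one column, which is the shape you would expect the printed DataFrame to have.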

Creating a DataFrame from an array of arrays

Node:

const dfd = require("danfojs-node")

let arr = [[12, 34, 2.2, 2], [30, 30, 2.1, 7]]
let df = new dfd.DataFrame(arr, {columns: ["A", "B", "C", "D"]})
df.print()

Browser:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- danfojs CDN -->
    <script src="https://cdn.jsdelivr.net/npm/danfojs@1.2.0/lib/bundle.min.js"></script>
    <title>Document</title>
</head>

<body>

    <script>

        let arr = [[12, 34, 2.2, 2], [30, 30, 2.1, 7]]
        let df = new dfd.DataFrame(arr, {columns: ["A", "B", "C", "D"]})
        df.print()

    </script>
</body>

</html>
╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║            │ A                 │ B                 │ C                 │ D                 ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 0          │ 12                │ 34                │ 2.2               │ 2                 ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 1          │ 30                │ 30                │ 2.1               │ 7                 ║
╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╝

Creating a DataFrame from a 2D tensor

Node:

const dfd = require("danfojs-node")
const tf = dfd.tensorflow

let tensor_arr = tf.tensor2d([[12, 34, 2.2, 2], [30, 30, 2.1, 7]])
let df = new dfd.DataFrame(tensor_arr, {columns: ["A", "B", "C", "D"]})
df.print()
df.ctypes.print()

Browser:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- danfojs CDN -->
    <script src="https://cdn.jsdelivr.net/npm/danfojs@1.2.0/lib/bundle.min.js"></script>
    <title>Document</title>
</head>

<body>

    <script>

        const tf = dfd.tensorflow

        let tensor_arr = tf.tensor2d([[12, 34, 2.2, 2], [30, 30, 2.1, 7]])
        let df = new dfd.DataFrame(tensor_arr, {columns: ["A", "B", "C", "D"]})
        df.print()
        df.ctypes.print()

    </script>
</body>

</html>
╔═══╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║   │ A                 │ B                 │ C                 │ D                 ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 0 │ 12                │ 34                │ 2.20000004768...  │ 2                 ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 1 │ 30                │ 30                │ 2.09999990463...  │ 7                 ║
╚═══╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╝

╔═══╤══════════════════════╗
║   │ 0                    ║
╟───┼──────────────────────╢
║ A │ int32                ║
╟───┼──────────────────────╢
║ B │ int32                ║
╟───┼──────────────────────╢
║ C │ float32              ║
╟───┼──────────────────────╢
║ D │ int32                ║
╚═══╧══════════════════════╝
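
The long decimals in column C are consistent with the values being held at 32-bit float precision, as the float32 entry in the dtype table suggests. The sketch below reproduces the effect in plain JavaScript with the standard `Math.fround`, which rounds a number to the nearest 32-bit float. It does not use Danfo.js or TensorFlow.js, so treat it as an explanation of the rounding rather than of how the library stores data.

```javascript
// Math.fround rounds a 64-bit JavaScript number to the nearest 32-bit float.
const original = [2.2, 2.1];
const asFloat32 = original.map((value) => Math.fround(value));

console.log(asFloat32[0]); // 2.200000047683716
console.log(asFloat32[1]); // 2.0999999046325684

// For display, rounding to a fixed number of decimals recovers the inputs.
const rounded = asFloat32.map((value) => Number(value.toFixed(2)));
console.log(rounded); // [ 2.2, 2.1 ]
```

Neither 2.2 nor 2.1 is exactly representable in binary floating point, and the error becomes visible once the value is narrowed to 32 bits.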

Creating a DataFrame from an object

Node:

const dfd = require("danfojs-node")

// Generate four yearly dates starting from 2017-01-01
const dates = dfd.dateRange({ start: '2017-01-01', end: "2020-01-01", period: 4, freq: "Y" })

console.log(dates);

// Each key is a column name; each array holds that column's values
const obj_data = {'A': dates,
                  'B': ["bval1", "bval2", "bval3", "bval4"],
                  'C': [10, 20, 30, 40],
                  'D': [1.2, 3.45, 60.1, 45],
                  'E': ["test", "train", "test", "train"]
                  }

const df = new dfd.DataFrame(obj_data)
df.print()

Browser:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <!-- danfojs CDN -->
    <script src="https://cdn.jsdelivr.net/npm/danfojs@1.2.0/lib/bundle.min.js"></script>
    <title>Document</title>
</head>

<body>

    <script>

        // Generate four yearly dates starting from 2017-01-01
        const dates = dfd.dateRange({ start: '2017-01-01', end: "2020-01-01", period: 4, freq: "Y" })

        console.log(dates);

        // Each key is a column name; each array holds that column's values
        const obj_data = {'A': dates,
                          'B': ["bval1", "bval2", "bval3", "bval4"],
                          'C': [10, 20, 30, 40],
                          'D': [1.2, 3.45, 60.1, 45],
                          'E': ["test", "train", "test", "train"]
                          }

        const df = new dfd.DataFrame(obj_data)
        df.print()

    </script>
</body>

</html>
//output in console
╔═══╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║   │ A                 │ B                 │ C                 │ D                 │ E                 ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 0 │ 1/1/2017, 1:0...  │ bval1             │ 10                │ 1.2               │ test              ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 1 │ 1/1/2018, 1:0...  │ bval2             │ 20                │ 3.45              │ train             ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 2 │ 1/1/2019, 1:0...  │ bval3             │ 30                │ 60.1              │ test              ║
╟───┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ 3 │ 1/1/2020, 1:0...  │ bval4             │ 40                │ 45                │ train             ║
╚═══╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╝
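
In this form the input is column-oriented: each key names a column and each array holds that column's values, so the arrays need to be the same length for the rows to line up. The plain-JavaScript sketch below shows that relationship. `columnsToRows` is a hypothetical helper written for this illustration, not a Danfo.js function, and the date column is left out to keep the example dependency-free.

```javascript
// Hypothetical helper for illustration only; not part of the Danfo.js API.
function columnsToRows(columnData) {
  const columns = Object.keys(columnData);
  const lengths = columns.map((name) => columnData[name].length);
  // Every column must contribute exactly one value per row.
  if (new Set(lengths).size > 1) {
    throw new Error(`Column lengths differ: ${lengths.join(", ")}`);
  }
  const rowCount = lengths.length > 0 ? lengths[0] : 0;
  const rows = [];
  for (let i = 0; i < rowCount; i++) {
    rows.push(columns.map((name) => columnData[name][i]));
  }
  return { columns, rows };
}

const objData = {
  B: ["bval1", "bval2", "bval3", "bval4"],
  C: [10, 20, 30, 40],
  D: [1.2, 3.45, 60.1, 45],
  E: ["test", "train", "test", "train"],
};

const result = columnsToRows(objData);
console.log(result.columns); // [ 'B', 'C', 'D', 'E' ]
console.log(result.rows[0]); // [ 'bval1', 10, 1.2, 'test' ]
console.log(result.rows.length); // 4
```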

Creating a DataFrame and specifying index, dtypes, columns

You can create a DataFrame and specify options such as index, column names, and dtypes, as well as configuration options such as display and memory mode.

Note: Specifying dtypes, column names, and index on DataFrame creation makes the process slightly faster.

import { DataFrame } from "danfojs"

let data1 = [[1, 2.3, 3, 4, 5, "girl"], [30, 40.1, 39, 89, 78, "boy"]];
let index = ["a", "b"];
let columns = ["col1", "col2", "col3", "col4", "col5", "col6"]
let dtypes = ["int32", "float32", "int32", "int32", "int32", "string"]

let df = new DataFrame(data1, { index, columns, dtypes });
df.print()
╔════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╤═══════════════════╗
║            │ col1              │ col2              │ col3              │ col4              │ col5              │ col6              ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ a          │ 1                 │ 2.3               │ 3                 │ 4                 │ 5                 │ girl              ║
╟────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────┼───────────────────╢
║ b          │ 30                │ 40.1              │ 39                │ 89                │ 78                │ boy               ║
╚════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╧═══════════════════╝
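
The example above relies on the options matching the shape of the data: one index label per row, and one column name and one dtype per column. The sketch below checks those lengths before a DataFrame is built. `checkFrameOptions` is a hypothetical helper written for this illustration and is not part of Danfo.js, which may perform its own validation.

```javascript
// Hypothetical pre-flight check for illustration; not part of the Danfo.js API.
// Returns a list of human-readable problems (empty when the shapes line up).
function checkFrameOptions(data, { index, columns, dtypes } = {}) {
  const rowCount = data.length;
  const columnCount = rowCount > 0 ? data[0].length : 0;
  const problems = [];
  if (index && index.length !== rowCount) {
    problems.push(`index has ${index.length} labels for ${rowCount} rows`);
  }
  if (columns && columns.length !== columnCount) {
    problems.push(`columns has ${columns.length} names for ${columnCount} columns`);
  }
  if (dtypes && dtypes.length !== columnCount) {
    problems.push(`dtypes has ${dtypes.length} entries for ${columnCount} columns`);
  }
  return problems;
}

const data1 = [[1, 2.3, 3, 4, 5, "girl"], [30, 40.1, 39, 89, 78, "boy"]];
const options = {
  index: ["a", "b"],
  columns: ["col1", "col2", "col3", "col4", "col5", "col6"],
  dtypes: ["int32", "float32", "int32", "int32", "int32", "string"],
};

console.log(checkFrameOptions(data1, options)); // []
console.log(checkFrameOptions(data1, { index: ["a"] }));
// [ 'index has 1 labels for 2 rows' ]
```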

Creating a DataFrame and specifying memory mode

To use less space on DataFrame creation, you can set the low memory mode as demonstrated below:

import { DataFrame } from "danfojs"

let data1 = [[1, 2.3, 3, 4, 5, "girl"], [30, 40.1, 39, 89, 78, "boy"]];

let df = new DataFrame(data1, {
    config: { lowMemoryMode: true }
});
df.print()

Note: In low memory mode, the DataFrame uses less space. The drawback is that some operations, especially those involving column data, become slightly slower.

Last updated 2 months ago

To load flat files such as CSV, Excel, and JSON into DataFrames, see the Input/Output section of the API reference.