{
"cells": [
{
"cell_type": "markdown",
"source": [
"# Manipulating columns of a DataFrame\n",
"## Renaming columns\n",
"Let's start with a DataFrame of Bools that has default column names."
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"using DataFrames"
],
"metadata": {},
"execution_count": 1
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"
1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 2
}
],
"cell_type": "code",
"source": [
"x = DataFrame(rand(Bool, 3, 4), :auto)"
],
"metadata": {},
"execution_count": 2
},
{
"cell_type": "markdown",
"source": [
"With `rename()`, we create new DataFrame; here we rename the column `:x1` to `:A`. (`rename` also accepts collections of Pairs.)"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 3
}
],
"cell_type": "code",
"source": [
"rename(x, :x1 => :A)"
],
"metadata": {},
"execution_count": 3
},
{
"cell_type": "markdown",
"source": [
"With `rename!()` we do an in place transformation.\n",
"This time we've applied a function to every column name (note that the function gets a column names as a string)."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1x1 \u001b[0m\u001b[1m x2x2 \u001b[0m\u001b[1m x3x3 \u001b[0m\u001b[1m x4x4 \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 4
}
],
"cell_type": "code",
"source": [
"rename!(c -> c^2, x)"
],
"metadata": {},
"execution_count": 4
},
{
"cell_type": "markdown",
"source": [
"We can also change the name of a particular column without knowing the original.\n",
"Here we change the name of the third column, creating a new DataFrame."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1x1 \u001b[0m\u001b[1m x2x2 \u001b[0m\u001b[1m third \u001b[0m\u001b[1m x4x4 \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 5
}
],
"cell_type": "code",
"source": [
"rename(x, 3 => :third)"
],
"metadata": {},
"execution_count": 5
},
{
"cell_type": "markdown",
"source": [
"If we pass a vector of names to rename!, we can change the names of all variables."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m d \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 6
}
],
"cell_type": "code",
"source": [
"rename!(x, [:a, :b, :c, :d])"
],
"metadata": {},
"execution_count": 6
},
{
"cell_type": "markdown",
"source": [
"In all the above examples you could have used strings instead of symbols, for example,"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m d \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 7
}
],
"cell_type": "code",
"source": [
"rename!(x, string.('a':'d'))"
],
"metadata": {},
"execution_count": 7
},
{
"cell_type": "markdown",
"source": [
"rename! allows for circular renaming of columns:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m d \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 8
}
],
"cell_type": "code",
"source": [
"x"
],
"metadata": {},
"execution_count": 8
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m d \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m a \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 9
}
],
"cell_type": "code",
"source": [
"rename!(x, \"a\"=>\"d\", \"d\"=>\"a\")"
],
"metadata": {},
"execution_count": 9
},
{
"cell_type": "markdown",
"source": [
"We get an error when we try to provide duplicate names"
],
"metadata": {}
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ArgumentError(\"Duplicate variable names: :a. Pass makeunique=true to make them unique using a suffix automatically.\")"
]
}
],
"cell_type": "code",
"source": [
"try\n",
" rename(x, fill(:a, 4))\n",
"catch e\n",
" show(e)\n",
"end"
],
"metadata": {},
"execution_count": 10
},
{
"cell_type": "markdown",
"source": [
"unless we pass makeunique=true, which allows us to handle duplicates in passed names."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m a_1 \u001b[0m\u001b[1m a_2 \u001b[0m\u001b[1m a_3 \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 11
}
],
"cell_type": "code",
"source": [
"rename(x, fill(:a, 4), makeunique=true)"
],
"metadata": {},
"execution_count": 11
},
{
"cell_type": "markdown",
"source": [
"## Reordering columns\n",
"We can reorder the names(x) vector as needed, creating a new DataFrame."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m d \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m a \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 12
}
],
"cell_type": "code",
"source": [
"using Random\n",
"Random.seed!(1234)\n",
"x[:, shuffle(names(x))]"
],
"metadata": {},
"execution_count": 12
},
{
"cell_type": "markdown",
"source": [
"Also select! can be used to achieve this in place (or select to perform a copy):"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m d \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m a \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ true false false false\n 2 │ true true true true\n 3 │ false true true true",
"text/html": [
"1 | true | false | false | false |
2 | true | true | true | true |
3 | false | true | true | true |
"
]
},
"metadata": {},
"execution_count": 13
}
],
"cell_type": "code",
"source": [
"x"
],
"metadata": {},
"execution_count": 13
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m c \u001b[0m\u001b[1m b \u001b[0m\u001b[1m d \u001b[0m\n │\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\u001b[90m Bool \u001b[0m\n─────┼────────────────────────────\n 1 │ false false false true\n 2 │ true true true true\n 3 │ true true true false",
"text/html": [
"1 | false | false | false | true |
2 | true | true | true | true |
3 | true | true | true | false |
"
]
},
"metadata": {},
"execution_count": 14
}
],
"cell_type": "code",
"source": [
"select!(x, 4:-1:1);\n",
"x"
],
"metadata": {},
"execution_count": 14
},
{
"cell_type": "markdown",
"source": [
"## Mrging/adding columns"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 15
}
],
"cell_type": "code",
"source": [
"x = DataFrame([(i,j) for i in 1:3, j in 1:4], :auto)"
],
"metadata": {},
"execution_count": 15
},
{
"cell_type": "markdown",
"source": [
"With `hcat` we can merge two `DataFrames`. Also `[x y]` syntax is supported but only when DataFrames have unique column names."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×8 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x1_1 \u001b[0m\u001b[1m x2_1 \u001b[0m\u001b[1m x3_1 \u001b[0m\u001b[1m x4_1 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 1) (1, 2) (1, 3) (1, 4)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 1) (2, 2) (2, 3) (2, 4)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 1) (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 1) | (1, 2) | (1, 3) | (1, 4) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 1) | (2, 2) | (2, 3) | (2, 4) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 1) | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 16
}
],
"cell_type": "code",
"source": [
"hcat(x, x, makeunique=true)"
],
"metadata": {},
"execution_count": 16
},
{
"cell_type": "markdown",
"source": [
"You can append a vector to a data frame with the following syntax:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m A \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼───────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) 1\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) 2\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) 3",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | 1 |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | 2 |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | 3 |
"
]
},
"metadata": {},
"execution_count": 17
}
],
"cell_type": "code",
"source": [
"y = [x DataFrame(A=[1,2,3])]"
],
"metadata": {},
"execution_count": 17
},
{
"cell_type": "markdown",
"source": [
"Here we do the same but add column `:A` to the front."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼───────────────────────────────────────\n 1 │ 1 (1, 1) (1, 2) (1, 3) (1, 4)\n 2 │ 2 (2, 1) (2, 2) (2, 3) (2, 4)\n 3 │ 3 (3, 1) (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | 1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) |
2 | 2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) |
3 | 3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 18
}
],
"cell_type": "code",
"source": [
"y = [DataFrame(A=[1,2,3]) x]"
],
"metadata": {},
"execution_count": 18
},
{
"cell_type": "markdown",
"source": [
"A column can also be added in the middle. Here a brute-force method is used and a new `DataFrame` is created."
],
"metadata": {}
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 3.796 μs (74 allocations: 6.39 KiB)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m A \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼───────────────────────────────────────\n 1 │ (1, 1) (1, 2) 1 (1, 3) (1, 4)\n 2 │ (2, 1) (2, 2) 2 (2, 3) (2, 4)\n 3 │ (3, 1) (3, 2) 3 (3, 3) (3, 4)",
"text/html": [
"1 | (1, 1) | (1, 2) | 1 | (1, 3) | (1, 4) |
2 | (2, 1) | (2, 2) | 2 | (2, 3) | (2, 4) |
3 | (3, 1) | (3, 2) | 3 | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 19
}
],
"cell_type": "code",
"source": [
"using BenchmarkTools\n",
"@btime [$x[!, 1:2] DataFrame(A=[1,2,3]) $x[!, 3:4]]"
],
"metadata": {},
"execution_count": 19
},
{
"cell_type": "markdown",
"source": [
"We could also do this with a specialized in place method `insertcols!`. Let's add `:newcol` to the `DataFrame` y."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×6 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m newcol \u001b[0m\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼───────────────────────────────────────────────\n 1 │ 1 1 (1, 1) (1, 2) (1, 3) (1, 4)\n 2 │ 2 2 (2, 1) (2, 2) (2, 3) (2, 4)\n 3 │ 3 3 (3, 1) (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | 1 | 1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) |
2 | 2 | 2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) |
3 | 3 | 3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 20
}
],
"cell_type": "code",
"source": [
"insertcols!(y, 2, \"newcol\" => [1,2,3])"
],
"metadata": {},
"execution_count": 20
},
{
"cell_type": "markdown",
"source": [
"If you want to insert the same column name several times `makeunique=true` is needed as usual."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×7 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m A \u001b[0m\u001b[1m newcol_1 \u001b[0m\u001b[1m newcol \u001b[0m\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼─────────────────────────────────────────────────────────\n 1 │ 1 1 1 (1, 1) (1, 2) (1, 3) (1, 4)\n 2 │ 2 2 2 (2, 1) (2, 2) (2, 3) (2, 4)\n 3 │ 3 3 3 (3, 1) (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | 1 | 1 | 1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) |
2 | 2 | 2 | 2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) |
3 | 3 | 3 | 3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 21
}
],
"cell_type": "code",
"source": [
"insertcols!(y, 2, :newcol => [1,2,3], makeunique=true)"
],
"metadata": {},
"execution_count": 21
},
{
"cell_type": "markdown",
"source": [
"We can see how much faster it is to insert a column with `insertcols!` than with `hcat` using `@btime` (note that we use here a `Pair` notation as an example)."
],
"metadata": {}
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 896.060 ns (16 allocations: 1.50 KiB)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m A \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼───────────────────────────────────────\n 1 │ (1, 1) (1, 2) 1 (1, 3) (1, 4)\n 2 │ (2, 1) (2, 2) 2 (2, 3) (2, 4)\n 3 │ (3, 1) (3, 2) 3 (3, 3) (3, 4)",
"text/html": [
"1 | (1, 1) | (1, 2) | 1 | (1, 3) | (1, 4) |
2 | (2, 1) | (2, 2) | 2 | (2, 3) | (2, 4) |
3 | (3, 1) | (3, 2) | 3 | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 22
}
],
"cell_type": "code",
"source": [
"@btime insertcols!(copy($x), 3, :A => [1,2,3])"
],
"metadata": {},
"execution_count": 22
},
{
"cell_type": "markdown",
"source": [
"Let's use `insertcols!` to append a column in place (note that we dropped the index at which we insert the column)"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m A \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼───────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) 1\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) 2\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) 3",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | 1 |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | 2 |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | 3 |
"
]
},
"metadata": {},
"execution_count": 23
}
],
"cell_type": "code",
"source": [
"insertcols!(x, :A => [1,2,3])"
],
"metadata": {},
"execution_count": 23
},
{
"cell_type": "markdown",
"source": [
"and to in place prepend a column."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×6 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m B \u001b[0m\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m A \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼──────────────────────────────────────────────\n 1 │ 1 (1, 1) (1, 2) (1, 3) (1, 4) 1\n 2 │ 2 (2, 1) (2, 2) (2, 3) (2, 4) 2\n 3 │ 3 (3, 1) (3, 2) (3, 3) (3, 4) 3",
"text/html": [
"1 | 1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | 1 |
2 | 2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | 2 |
3 | 3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | 3 |
"
]
},
"metadata": {},
"execution_count": 24
}
],
"cell_type": "code",
"source": [
"insertcols!(x, 1, :B => [1,2,3])"
],
"metadata": {},
"execution_count": 24
},
{
"cell_type": "markdown",
"source": [
"Note that `insertcols!` can be used to insert several columns to a data frame at once and that it performs broadcasting if needed:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\n │\u001b[90m Int64 \u001b[0m\n─────┼───────\n 1 │ 1\n 2 │ 2\n 3 │ 3",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 25
}
],
"cell_type": "code",
"source": [
"df = DataFrame(a = [1, 2, 3])"
],
"metadata": {},
"execution_count": 25
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\u001b[1m d \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m String \u001b[0m\u001b[90m Char \u001b[0m\u001b[90m Array… \u001b[0m\n─────┼────────────────────────────────\n 1 │ 1 x a [1, 2, 3]\n 2 │ 2 x b [1, 2, 3]\n 3 │ 3 x c [1, 2, 3]",
"text/html": [
"1 | 1 | x | a | [1, 2, 3] |
2 | 2 | x | b | [1, 2, 3] |
3 | 3 | x | c | [1, 2, 3] |
"
]
},
"metadata": {},
"execution_count": 26
}
],
"cell_type": "code",
"source": [
"insertcols!(df, :b => \"x\", :c => 'a':'c', :d => Ref([1,2,3]))"
],
"metadata": {},
"execution_count": 26
},
{
"cell_type": "markdown",
"source": [
"Interestingly we can emulate `hcat` mutating the data frame in-place using `insertcols!`:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\n │\u001b[90m Int64 \u001b[0m\n─────┼───────\n 1 │ 1\n 2 │ 2",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 27
}
],
"cell_type": "code",
"source": [
"df1 = DataFrame(a=[1,2])"
],
"metadata": {},
"execution_count": 27
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼──────────────\n 1 │ 2 3\n 2 │ 3 4",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 28
}
],
"cell_type": "code",
"source": [
"df2 = DataFrame(b=[2,3], c=[3,4])"
],
"metadata": {},
"execution_count": 28
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────────\n 1 │ 1 2 3\n 2 │ 2 3 4",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 29
}
],
"cell_type": "code",
"source": [
"hcat(df1, df2)"
],
"metadata": {},
"execution_count": 29
},
{
"cell_type": "markdown",
"source": [
"df1 is not touched"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\n │\u001b[90m Int64 \u001b[0m\n─────┼───────\n 1 │ 1\n 2 │ 2",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 30
}
],
"cell_type": "code",
"source": [
"df1"
],
"metadata": {},
"execution_count": 30
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────────\n 1 │ 1 2 3\n 2 │ 2 3 4",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 31
}
],
"cell_type": "code",
"source": [
"insertcols!(df1, pairs(eachcol(df2))...)"
],
"metadata": {},
"execution_count": 31
},
{
"cell_type": "markdown",
"source": [
"now we have changed df1"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m2×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m a \u001b[0m\u001b[1m b \u001b[0m\u001b[1m c \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────────\n 1 │ 1 2 3\n 2 │ 2 3 4",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 32
}
],
"cell_type": "code",
"source": [
"df1"
],
"metadata": {},
"execution_count": 32
},
{
"cell_type": "markdown",
"source": [
"## Subsetting/removing columns\n",
"Let's create a new `DataFrame` `x` and show a few ways to create DataFrames with a subset of `x`'s columns."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 33
}
],
"cell_type": "code",
"source": [
"x = DataFrame([(i,j) for i in 1:3, j in 1:5], :auto)"
],
"metadata": {},
"execution_count": 33
},
{
"cell_type": "markdown",
"source": [
"First we could do this by index:\n",
"You could use `!` instead of `:` for non-copying operation"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 34
}
],
"cell_type": "code",
"source": [
"x[:, [1,2,4,5]]"
],
"metadata": {},
"execution_count": 34
},
{
"cell_type": "markdown",
"source": [
"or by column name:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────\n 1 │ (1, 1) (1, 4)\n 2 │ (2, 1) (2, 4)\n 3 │ (3, 1) (3, 4)",
"text/html": [
"1 | (1, 1) | (1, 4) |
2 | (2, 1) | (2, 4) |
3 | (3, 1) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 35
}
],
"cell_type": "code",
"source": [
"x[:, [:x1, :x4]]"
],
"metadata": {},
"execution_count": 35
},
{
"cell_type": "markdown",
"source": [
"We can also choose to keep or exclude columns by `Bool` (we need a vector whose length is the number of columns in the original `DataFrame`)."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────\n 1 │ (1, 1) (1, 3) (1, 5)\n 2 │ (2, 1) (2, 3) (2, 5)\n 3 │ (3, 1) (3, 3) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 3) | (1, 5) |
2 | (2, 1) | (2, 3) | (2, 5) |
3 | (3, 1) | (3, 3) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 36
}
],
"cell_type": "code",
"source": [
"x[:, [true, false, true, false, true]]"
],
"metadata": {},
"execution_count": 36
},
{
"cell_type": "markdown",
"source": [
"Here we create a single column `DataFrame`,"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\n─────┼────────\n 1 │ (1, 1)\n 2 │ (2, 1)\n 3 │ (3, 1)",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 37
}
],
"cell_type": "code",
"source": [
"x[:, [:x1]]"
],
"metadata": {},
"execution_count": 37
},
{
"cell_type": "markdown",
"source": [
"and here we access the vector contained in column `:x1`."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Tuple{Int64, Int64}}:\n (1, 1)\n (2, 1)\n (3, 1)"
},
"metadata": {},
"execution_count": 38
}
],
"cell_type": "code",
"source": [
"x[!, :x1] ## use : instead of ! to copy"
],
"metadata": {},
"execution_count": 38
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Tuple{Int64, Int64}}:\n (1, 1)\n (2, 1)\n (3, 1)"
},
"metadata": {},
"execution_count": 39
}
],
"cell_type": "code",
"source": [
"x.x1 ## the same"
],
"metadata": {},
"execution_count": 39
},
{
"cell_type": "markdown",
"source": [
"We could grab the same vector by column number"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Tuple{Int64, Int64}}:\n (1, 1)\n (2, 1)\n (3, 1)"
},
"metadata": {},
"execution_count": 40
}
],
"cell_type": "code",
"source": [
"x[!, 1]"
],
"metadata": {},
"execution_count": 40
},
{
"cell_type": "markdown",
"source": [
"Note that getting a single column returns it without copying while creating a new `DataFrame` performs a copy of the column"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "false"
},
"metadata": {},
"execution_count": 41
}
],
"cell_type": "code",
"source": [
"x[!, 1] === x[!, [1]]"
],
"metadata": {},
"execution_count": 41
},
{
"cell_type": "markdown",
"source": [
"you can also use `Regex`, `All`, `Between` and `Not` from InvertedIndies.jl for column selection:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────\n 1 │ (1, 1) (1, 2)\n 2 │ (2, 1) (2, 2)\n 3 │ (3, 1) (3, 2)",
"text/html": [
"1 | (1, 1) | (1, 2) |
2 | (2, 1) | (2, 2) |
3 | (3, 1) | (3, 2) |
"
]
},
"metadata": {},
"execution_count": 42
}
],
"cell_type": "code",
"source": [
"x[!, r\"[12]\"]"
],
"metadata": {},
"execution_count": 42
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 43
}
],
"cell_type": "code",
"source": [
"x[!, Not(1)]"
],
"metadata": {},
"execution_count": 43
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────\n 1 │ (1, 2) (1, 3) (1, 4)\n 2 │ (2, 2) (2, 3) (2, 4)\n 3 │ (3, 2) (3, 3) (3, 4)",
"text/html": [
"1 | (1, 2) | (1, 3) | (1, 4) |
2 | (2, 2) | (2, 3) | (2, 4) |
3 | (3, 2) | (3, 3) | (3, 4) |
"
]
},
"metadata": {},
"execution_count": 44
}
],
"cell_type": "code",
"source": [
"x[!, Between(:x2, :x4)]"
],
"metadata": {},
"execution_count": 44
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 45
}
],
"cell_type": "code",
"source": [
"x[!, Cols(:x1, Between(:x3, :x5))]"
],
"metadata": {},
"execution_count": 45
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 46
}
],
"cell_type": "code",
"source": [
"select(x, :x1, Between(:x3, :x5), copycols=false) ## the same as above"
],
"metadata": {},
"execution_count": 46
},
{
"cell_type": "markdown",
"source": [
"you can use `select` and `select!` functions to select a subset of columns from a data frame. `select` creates a new data frame and `select!` operates in place"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 47
}
],
"cell_type": "code",
"source": [
"df = copy(x)"
],
"metadata": {},
"execution_count": 47
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────\n 1 │ (1, 1) (1, 2)\n 2 │ (2, 1) (2, 2)\n 3 │ (3, 1) (3, 2)",
"text/html": [
"1 | (1, 1) | (1, 2) |
2 | (2, 1) | (2, 2) |
3 | (3, 1) | (3, 2) |
"
]
},
"metadata": {},
"execution_count": 48
}
],
"cell_type": "code",
"source": [
"df2 = select(df, [1, 2])"
],
"metadata": {},
"execution_count": 48
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────\n 1 │ (1, 3) (1, 4) (1, 5)\n 2 │ (2, 3) (2, 4) (2, 5)\n 3 │ (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 49
}
],
"cell_type": "code",
"source": [
"select(df, Not([1, 2]))"
],
"metadata": {},
"execution_count": 49
},
{
"cell_type": "markdown",
"source": [
"by default `select` copies columns"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "false"
},
"metadata": {},
"execution_count": 50
}
],
"cell_type": "code",
"source": [
"df2[!, 1] === df[!, 1]"
],
"metadata": {},
"execution_count": 50
},
{
"cell_type": "markdown",
"source": [
"which can be avoided by giving `copycols=false`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────\n 1 │ (1, 1) (1, 2)\n 2 │ (2, 1) (2, 2)\n 3 │ (3, 1) (3, 2)",
"text/html": [
"1 | (1, 1) | (1, 2) |
2 | (2, 1) | (2, 2) |
3 | (3, 1) | (3, 2) |
"
]
},
"metadata": {},
"execution_count": 51
}
],
"cell_type": "code",
"source": [
"df2 = select(df, [1, 2], copycols=false)"
],
"metadata": {},
"execution_count": 51
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "true"
},
"metadata": {},
"execution_count": 52
}
],
"cell_type": "code",
"source": [
"df2[!, 1] === df[!, 1]"
],
"metadata": {},
"execution_count": 52
},
{
"cell_type": "markdown",
"source": [
"using `select!` will modify the source data frame"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────\n 1 │ (1, 1) (1, 2)\n 2 │ (2, 1) (2, 2)\n 3 │ (3, 1) (3, 2)",
"text/html": [
"1 | (1, 1) | (1, 2) |
2 | (2, 1) | (2, 2) |
3 | (3, 1) | (3, 2) |
"
]
},
"metadata": {},
"execution_count": 53
}
],
"cell_type": "code",
"source": [
"select!(df, [1,2])"
],
"metadata": {},
"execution_count": 53
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "true"
},
"metadata": {},
"execution_count": 54
}
],
"cell_type": "code",
"source": [
"df == df2"
],
"metadata": {},
"execution_count": 54
},
{
"cell_type": "markdown",
"source": [
"Here we create a copy of `x` and delete the 3rd column from the copy with `select!` and `Not`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 55
}
],
"cell_type": "code",
"source": [
"z = copy(x)\n",
"select!(z, Not(3))"
],
"metadata": {},
"execution_count": 55
},
{
"cell_type": "markdown",
"source": [
"alternatively we can achieve the same by using the `select` function"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 56
}
],
"cell_type": "code",
"source": [
"select(x, Not(3))"
],
"metadata": {},
"execution_count": 56
},
{
"cell_type": "markdown",
"source": [
"`x` stays unchanged"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 57
}
],
"cell_type": "code",
"source": [
"x"
],
"metadata": {},
"execution_count": 57
},
{
"cell_type": "markdown",
"source": [
"## Views\n",
"Note, that you can also create a view of a DataFrame when we want a subset of its columns:"
],
"metadata": {}
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 909.273 ns (18 allocations: 1.67 KiB)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────\n 1 │ (1, 1) (1, 3) (1, 5)\n 2 │ (2, 1) (2, 3) (2, 5)\n 3 │ (3, 1) (3, 3) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 3) | (1, 5) |
2 | (2, 1) | (2, 3) | (2, 5) |
3 | (3, 1) | (3, 3) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 58
}
],
"cell_type": "code",
"source": [
"@btime x[:, [1,3,5]]"
],
"metadata": {},
"execution_count": 58
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 274.932 ns (3 allocations: 240 bytes)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 SubDataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────\n 1 │ (1, 1) (1, 3) (1, 5)\n 2 │ (2, 1) (2, 3) (2, 5)\n 3 │ (3, 1) (3, 3) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 3) | (1, 5) |
2 | (2, 1) | (2, 3) | (2, 5) |
3 | (3, 1) | (3, 3) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 59
}
],
"cell_type": "code",
"source": [
"@btime @view x[:, [1,3,5]]"
],
"metadata": {},
"execution_count": 59
},
{
"cell_type": "markdown",
"source": [
"(Right now creation of the `view` is slow, but in the coming releases of the `DataFrames.jl` package it will become significantly faster)"
],
"metadata": {}
},
{
"cell_type": "markdown",
"source": [
"## Modify column by name"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 60
}
],
"cell_type": "code",
"source": [
"x = DataFrame([(i,j) for i in 1:3, j in 1:5], :auto)"
],
"metadata": {},
"execution_count": 60
},
{
"cell_type": "markdown",
"source": [
"With the following syntax, the existing column is modified without performing any copying (this is discouraged as it creates column alias)."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 2) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 2) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 2) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 2) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 2) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 2) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 61
}
],
"cell_type": "code",
"source": [
"x[!, :x1] = x[!, :x2]\n",
"x"
],
"metadata": {},
"execution_count": 61
},
{
"cell_type": "markdown",
"source": [
"this syntax is safer since it performs copy"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Tuple{Int64, Int64}}:\n (1, 2)\n (2, 2)\n (3, 2)"
},
"metadata": {},
"execution_count": 62
}
],
"cell_type": "code",
"source": [
"x[!, :x1] = x[:, :x2]"
],
"metadata": {},
"execution_count": 62
},
{
"cell_type": "markdown",
"source": [
"We can also use the following syntax to add a new column at the end of a `DataFrame`."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×6 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\u001b[1m A \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼───────────────────────────────────────────────\n 1 │ (1, 2) (1, 2) (1, 3) (1, 4) (1, 5) 1\n 2 │ (2, 2) (2, 2) (2, 3) (2, 4) (2, 5) 2\n 3 │ (3, 2) (3, 2) (3, 3) (3, 4) (3, 5) 3",
"text/html": [
"1 | (1, 2) | (1, 2) | (1, 3) | (1, 4) | (1, 5) | 1 |
2 | (2, 2) | (2, 2) | (2, 3) | (2, 4) | (2, 5) | 2 |
3 | (3, 2) | (3, 2) | (3, 3) | (3, 4) | (3, 5) | 3 |
"
]
},
"metadata": {},
"execution_count": 63
}
],
"cell_type": "code",
"source": [
"x[!, :A] = [1,2,3]\n",
"x"
],
"metadata": {},
"execution_count": 63
},
{
"cell_type": "markdown",
"source": [
"A new column name will be added to our `DataFrame` with the following syntax as well:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×7 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\u001b[1m A \u001b[0m\u001b[1m B \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼──────────────────────────────────────────────────────\n 1 │ (1, 2) (1, 2) (1, 3) (1, 4) (1, 5) 1 11\n 2 │ (2, 2) (2, 2) (2, 3) (2, 4) (2, 5) 2 12\n 3 │ (3, 2) (3, 2) (3, 3) (3, 4) (3, 5) 3 13",
"text/html": [
"1 | (1, 2) | (1, 2) | (1, 3) | (1, 4) | (1, 5) | 1 | 11 |
2 | (2, 2) | (2, 2) | (2, 3) | (2, 4) | (2, 5) | 2 | 12 |
3 | (3, 2) | (3, 2) | (3, 3) | (3, 4) | (3, 5) | 3 | 13 |
"
]
},
"metadata": {},
"execution_count": 64
}
],
"cell_type": "code",
"source": [
"x.B = 11:13\n",
"x"
],
"metadata": {},
"execution_count": 64
},
{
"cell_type": "markdown",
"source": [
"## Find column name"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\n │\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\u001b[90m Tuple… \u001b[0m\n─────┼────────────────────────────────────────\n 1 │ (1, 1) (1, 2) (1, 3) (1, 4) (1, 5)\n 2 │ (2, 1) (2, 2) (2, 3) (2, 4) (2, 5)\n 3 │ (3, 1) (3, 2) (3, 3) (3, 4) (3, 5)",
"text/html": [
"1 | (1, 1) | (1, 2) | (1, 3) | (1, 4) | (1, 5) |
2 | (2, 1) | (2, 2) | (2, 3) | (2, 4) | (2, 5) |
3 | (3, 1) | (3, 2) | (3, 3) | (3, 4) | (3, 5) |
"
]
},
"metadata": {},
"execution_count": 65
}
],
"cell_type": "code",
"source": [
"x = DataFrame([(i,j) for i in 1:3, j in 1:5], :auto)"
],
"metadata": {},
"execution_count": 65
},
{
"cell_type": "markdown",
"source": [
"We can check if a column with a given name exists via `hasproperty`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "true"
},
"metadata": {},
"execution_count": 66
}
],
"cell_type": "code",
"source": [
"hasproperty(x, :x1)"
],
"metadata": {},
"execution_count": 66
},
{
"cell_type": "markdown",
"source": [
"and determine its index via `columnindex`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "2"
},
"metadata": {},
"execution_count": 67
}
],
"cell_type": "code",
"source": [
"columnindex(x, :x2)"
],
"metadata": {},
"execution_count": 67
},
{
"cell_type": "markdown",
"source": [
"## Advanced ways of column selection\n",
"these are most useful for non-standard column names (e.g. containing spaces)"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m column 2 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────\n 1 │ 1 4\n 2 │ 2 5\n 3 │ 3 6",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 68
}
],
"cell_type": "code",
"source": [
"df = DataFrame()\n",
"df.x1 = 1:3\n",
"df[!, \"column 2\"] = 4:6\n",
"df"
],
"metadata": {},
"execution_count": 68
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Int64}:\n 4\n 5\n 6"
},
"metadata": {},
"execution_count": 69
}
],
"cell_type": "code",
"source": [
"df.\"column 2\""
],
"metadata": {},
"execution_count": 69
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Int64}:\n 4\n 5\n 6"
},
"metadata": {},
"execution_count": 70
}
],
"cell_type": "code",
"source": [
"df[:, \"column 2\"]"
],
"metadata": {},
"execution_count": 70
},
{
"cell_type": "markdown",
"source": [
"or you can interpolate column names using the `:()` syntax"
],
"metadata": {}
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"x1\n",
"[1, 2, 3]\n",
"\n",
"column 2\n",
"[4, 5, 6]\n",
"\n"
]
}
],
"cell_type": "code",
"source": [
"for n in names(df)\n",
" println(n, \"\\n\", df.:($n), \"\\n\")\n",
"end"
],
"metadata": {},
"execution_count": 71
},
{
"cell_type": "markdown",
"source": [
"## Working on a collection of columns\n",
"When using `eachcol` of a data frame the resulting object retains reference to its parent and e.g. can be queried with `getproperty`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼────────────────────────────\n 1 │ 1 4 7 10\n 2 │ 2 5 8 11\n 3 │ 3 6 9 12",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 72
}
],
"cell_type": "code",
"source": [
"df = DataFrame(reshape(1:12, 3, 4), :auto)"
],
"metadata": {},
"execution_count": 72
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrameColumns\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼────────────────────────────\n 1 │ 1 4 7 10\n 2 │ 2 5 8 11\n 3 │ 3 6 9 12",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 73
}
],
"cell_type": "code",
"source": [
"ec_df = eachcol(df)"
],
"metadata": {},
"execution_count": 73
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Int64}:\n 1\n 2\n 3"
},
"metadata": {},
"execution_count": 74
}
],
"cell_type": "code",
"source": [
"ec_df[1]"
],
"metadata": {},
"execution_count": 74
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "3-element Vector{Int64}:\n 1\n 2\n 3"
},
"metadata": {},
"execution_count": 75
}
],
"cell_type": "code",
"source": [
"ec_df.x1"
],
"metadata": {},
"execution_count": 75
},
{
"cell_type": "markdown",
"source": [
"## Transforming columns\n",
"\n",
"We will get to this subject later in 10_transforms.ipynb notebook, but here let us just note that `select`, `select!`, `transform`, `transform!` and `combine` functions allow to generate new columns based on the old columns of a data frame.\n",
"\n",
"The general rules are the following:\n",
"* `select` and `transform` always return the number of rows equal to the source data frame, while `combine` returns any number of rows (`combine` is allowed to *combine* rows of the source data frame)\n",
"* `transform` retains columns from the old data frame\n",
"* `select!` and `transform!` are in-place versions of `select` and `transform`"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×4 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼────────────────────────────\n 1 │ 1 4 7 10\n 2 │ 2 5 8 11\n 3 │ 3 6 9 12",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 76
}
],
"cell_type": "code",
"source": [
"df = DataFrame(reshape(1:12, 3, 4), :auto)"
],
"metadata": {},
"execution_count": 76
},
{
"cell_type": "markdown",
"source": [
"Here we add a new column `:res` that is a sum of columns `:x1` and `:x2`. A general syntax of transformations of this kind is:\n",
"\n",
"```\n",
"source_columns => function_to_apply => target_column_name\n",
"```\n",
"then `function_to_apply` gets columns selected by `source_columns` as positional arguments."
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×5 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m res \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼───────────────────────────────────\n 1 │ 1 4 7 10 5\n 2 │ 2 5 8 11 7\n 3 │ 3 6 9 12 9",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 77
}
],
"cell_type": "code",
"source": [
"transform(df, [:x1, :x2] => (+) => :res)"
],
"metadata": {},
"execution_count": 77
},
{
"cell_type": "markdown",
"source": [
"One can omit passing target_column_name in which case it is automatically generated:"
],
"metadata": {}
},
{
"outputs": [],
"cell_type": "code",
"source": [
"using Statistics"
],
"metadata": {},
"execution_count": 78
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m1×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1_x2_cor \u001b[0m\n │\u001b[90m Float64 \u001b[0m\n─────┼───────────\n 1 │ 1.0",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 79
}
],
"cell_type": "code",
"source": [
"combine(df, [:x1, :x2] => cor)"
],
"metadata": {},
"execution_count": 79
},
{
"cell_type": "markdown",
"source": [
"Note that `combine` allowed the number of columns in the resulting data frame to be changed. If we used `select` instead it would automatically broadcast the return value to match the number of rows of the source:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1_x2_cor \u001b[0m\n │\u001b[90m Float64 \u001b[0m\n─────┼───────────\n 1 │ 1.0\n 2 │ 1.0\n 3 │ 1.0",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 80
}
],
"cell_type": "code",
"source": [
"select(df, [:x1, :x2] => cor)"
],
"metadata": {},
"execution_count": 80
},
{
"cell_type": "markdown",
"source": [
"If you want to apply some function on each row of the source wrap it in `ByRow`:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x1_x2_string \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m String \u001b[0m\n─────┼────────────────────────────\n 1 │ 1 4 14\n 2 │ 2 5 25\n 3 │ 3 6 36",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 81
}
],
"cell_type": "code",
"source": [
"select(df, :x1, :x2, [:x1, :x2] => ByRow(string))"
],
"metadata": {},
"execution_count": 81
},
{
"cell_type": "markdown",
"source": [
"Also if you want columns to be passed as a `NamedTuple` to a function (instead of being positional arguments) wrap them in `AsTable`:"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x1_x2_function \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼──────────────────────────────\n 1 │ 1 4 5\n 2 │ 2 5 7\n 3 │ 3 6 9",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 82
}
],
"cell_type": "code",
"source": [
"select(df, :x1, :x2, AsTable([:x1, :x2]) => x -> x.x1 + x.x2)"
],
"metadata": {},
"execution_count": 82
},
{
"cell_type": "markdown",
"source": [
"For simplicity (as this functionality is often needed) there is a special treatment of `nrow` function that can be just passed as a transformation (without specifying of column selector):"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×2 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m number_of_rows \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼───────────────────────\n 1 │ 1 3\n 2 │ 2 3\n 3 │ 3 3",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 83
}
],
"cell_type": "code",
"source": [
"select(df, :x1, nrow => \"number_of_rows\")"
],
"metadata": {},
"execution_count": 83
},
{
"cell_type": "markdown",
"source": [
"(note that in `select` the number of rows is automatically broadcasted to match the number of rows of the source data frame)\n",
"Finally you can conveniently create multiple columns with one function, :"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x1² \u001b[0m\u001b[1m x1³ \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────────\n 1 │ 1 1 1\n 2 │ 2 4 8\n 3 │ 3 9 27",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 84
}
],
"cell_type": "code",
"source": [
"select(df, :x1, :x1 => ByRow(x -> [x ^ 2, x ^ 3]) => [\"x1²\", \"x1³\"])"
],
"metadata": {},
"execution_count": 84
},
{
"cell_type": "markdown",
"source": [
"or (this produces the same result)"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m3×3 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x1² \u001b[0m\u001b[1m x1³ \u001b[0m\n │\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\u001b[90m Int64 \u001b[0m\n─────┼─────────────────────\n 1 │ 1 1 1\n 2 │ 2 4 8\n 3 │ 3 9 27",
"text/html": [
""
]
},
"metadata": {},
"execution_count": 85
}
],
"cell_type": "code",
"source": [
"select(df, :x1, :x1 => (x -> DataFrame(\"x1²\" => x .^ 2, \"x1³\" => x .^ 3)) => AsTable)"
],
"metadata": {},
"execution_count": 85
},
{
"cell_type": "markdown",
"source": [
"Note that since `DataFrames.jl` row aggregation for wide tables is efficient. Here is an example of a wide table with `sum` (other standard reductions are similarly supported):"
],
"metadata": {}
},
{
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m1000×10000 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m x1 \u001b[0m\u001b[1m x2 \u001b[0m\u001b[1m x3 \u001b[0m\u001b[1m x4 \u001b[0m\u001b[1m x5 \u001b[0m\u001b[1m x6 \u001b[0m\u001b[1m x7 \u001b[0m ⋯\n │\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float64 \u001b[0m\u001b[90m Float\u001b[0m ⋯\n──────┼─────────────────────────────────────────────────────────────────────────\n 1 │ 0.0149088 0.177963 0.374975 0.694063 0.290585 0.296972 0.151 ⋯\n 2 │ 0.520355 0.670122 0.387857 0.724383 0.0800228 0.248548 0.911\n 3 │ 0.639562 0.0423618 0.779594 0.130453 0.173812 0.525416 0.324\n 4 │ 0.839622 0.740699 0.360969 0.456934 0.398093 0.509447 0.335\n 5 │ 0.967143 0.302672 0.706902 0.0653216 0.644417 0.976457 0.368 ⋯\n 6 │ 0.205168 0.683128 0.786909 0.819635 0.759663 0.588019 0.244\n 7 │ 0.527184 0.22961 0.395737 0.712888 0.286281 0.0784137 0.214\n 8 │ 0.951162 0.963781 0.176482 0.751149 0.394366 0.945845 0.627\n ⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱\n 994 │ 0.424295 0.78012 0.577681 0.274228 0.225563 0.29553 0.395 ⋯\n 995 │ 0.481554 0.0441882 0.700181 0.317677 0.0846761 0.949135 0.457\n 996 │ 0.401528 0.658903 0.438309 0.211217 0.211515 0.441902 0.028\n 997 │ 0.58985 0.909857 0.354735 0.442829 0.438322 0.847835 0.843\n 998 │ 0.871944 0.821183 0.725636 0.521645 0.230773 0.506747 0.472 ⋯\n 999 │ 0.702069 0.399582 0.377331 0.542735 0.38631 0.555332 0.742\n 1000 │ 0.70703 0.166208 0.45823 0.0682235 0.458422 0.836939 0.582\n\u001b[36m 9994 columns and 985 rows omitted\u001b[0m",
"text/html": [
"1000×10000 DataFrame
9900 columns and 975 rows omitted
1 | 0.0149088 | 0.177963 | 0.374975 | 0.694063 | 0.290585 | 0.296972 | 0.151378 | 0.0895086 | 0.160365 | 0.34983 | 0.620936 | 0.523452 | 0.338873 | 0.345876 | 0.0554305 | 0.318905 | 0.731403 | 0.0512964 | 0.410298 | 0.704251 | 0.764749 | 0.433943 | 0.285334 | 0.272852 | 0.347178 | 0.418385 | 0.244038 | 0.80504 | 0.439898 | 0.749506 | 0.786246 | 0.900965 | 0.295996 | 0.746868 | 0.422829 | 0.234034 | 0.577203 | 0.146846 | 0.793704 | 0.487894 | 0.631279 | 0.67129 | 0.106474 | 0.528607 | 0.307837 | 0.502022 | 0.0295761 | 0.292053 | 0.429436 | 0.565343 | 0.432389 | 0.228183 | 0.490389 | 0.178154 | 0.596335 | 0.896194 | 0.874415 | 0.632939 | 0.76408 | 0.591913 | 0.810214 | 0.547853 | 0.0568375 | 0.608492 | 0.939105 | 0.858401 | 0.54132 | 0.52386 | 0.0726798 | 0.0929697 | 0.735107 | 0.693431 | 0.35952 | 0.303867 | 0.159923 | 0.0273659 | 0.580728 | 0.146714 | 0.516603 | 0.192205 | 0.773247 | 0.463275 | 0.692037 | 0.721464 | 0.350054 | 0.708989 | 0.0366043 | 0.732236 | 0.849192 | 0.379813 | 0.389003 | 0.00313487 | 0.577076 | 0.644305 | 0.848968 | 0.901229 | 0.0207003 | 0.580229 | 0.113557 | 0.426518 | ⋯ |
2 | 0.520355 | 0.670122 | 0.387857 | 0.724383 | 0.0800228 | 0.248548 | 0.911363 | 0.36889 | 0.529752 | 0.910771 | 0.997118 | 0.25405 | 0.653331 | 0.778802 | 0.208101 | 0.147263 | 0.724127 | 0.466017 | 0.315177 | 0.108208 | 0.913713 | 0.287405 | 0.243088 | 0.880804 | 0.688204 | 0.255706 | 0.962344 | 0.470052 | 0.000956712 | 0.914209 | 0.793158 | 0.282226 | 0.212236 | 0.347833 | 0.662226 | 0.411696 | 0.906173 | 0.0648159 | 0.382069 | 0.174568 | 0.852559 | 0.861592 | 0.918437 | 0.375104 | 0.343001 | 0.558225 | 0.683841 | 0.23744 | 0.24166 | 0.329452 | 0.828058 | 0.160971 | 0.812828 | 0.885788 | 0.767927 | 0.648915 | 0.93628 | 0.255611 | 0.382833 | 0.832244 | 0.703672 | 0.983996 | 0.280333 | 0.541443 | 0.801407 | 0.44169 | 0.932841 | 0.798906 | 0.425353 | 0.672373 | 0.949815 | 0.263668 | 0.0377062 | 0.558003 | 0.297895 | 0.526442 | 0.698634 | 0.264102 | 0.83779 | 0.275102 | 0.0506671 | 0.122803 | 0.768489 | 0.792194 | 0.612094 | 0.301353 | 0.771368 | 0.5832 | 0.491879 | 0.0327138 | 0.897539 | 0.0459418 | 0.808951 | 0.15295 | 0.271582 | 0.328406 | 0.811824 | 0.0813524 | 0.00851212 | 0.407592 | ⋯ |
3 | 0.639562 | 0.0423618 | 0.779594 | 0.130453 | 0.173812 | 0.525416 | 0.324965 | 0.905649 | 0.0863895 | 0.504512 | 0.831362 | 0.609332 | 0.362646 | 0.296489 | 0.124294 | 0.445428 | 0.383877 | 0.701939 | 0.899178 | 0.802026 | 0.596313 | 0.550465 | 0.0130267 | 0.305685 | 0.242941 | 0.976732 | 0.855757 | 0.185339 | 0.492588 | 0.886498 | 0.0670387 | 0.224632 | 0.973181 | 0.700497 | 0.927605 | 0.244754 | 0.552813 | 0.232683 | 0.535975 | 0.756654 | 0.663562 | 0.417351 | 0.610787 | 0.32249 | 0.977989 | 0.959334 | 0.825686 | 0.933289 | 0.15489 | 0.427586 | 0.921159 | 0.0408699 | 0.178735 | 0.182879 | 0.299849 | 0.646616 | 0.163978 | 0.685145 | 0.897406 | 0.689855 | 0.430954 | 0.695262 | 0.00489971 | 0.684737 | 0.112393 | 0.620627 | 0.329097 | 0.568049 | 0.53415 | 0.368477 | 0.713737 | 0.791697 | 0.0436433 | 0.356715 | 0.29434 | 0.8489 | 0.73944 | 0.577009 | 0.369629 | 0.0308708 | 0.0632458 | 0.668124 | 0.481055 | 0.629309 | 0.93324 | 0.107562 | 0.982491 | 0.453432 | 0.281526 | 0.372878 | 0.837115 | 0.196414 | 0.776446 | 0.191924 | 0.256692 | 0.0757817 | 0.274321 | 0.84296 | 0.899023 | 0.00600951 | ⋯ |
4 | 0.839622 | 0.740699 | 0.360969 | 0.456934 | 0.398093 | 0.509447 | 0.335432 | 0.373952 | 0.398662 | 0.851182 | 0.402862 | 0.827348 | 0.222065 | 0.646395 | 0.139992 | 0.720584 | 0.0681818 | 0.452283 | 0.655416 | 0.919854 | 0.634287 | 0.552484 | 0.39818 | 0.722039 | 0.8756 | 0.448769 | 0.554755 | 0.613562 | 0.710213 | 0.00465215 | 0.963763 | 0.111464 | 0.579245 | 0.0783318 | 0.811602 | 0.0586896 | 0.121353 | 0.451426 | 0.416095 | 0.167802 | 0.296337 | 0.68025 | 0.865083 | 0.455578 | 0.984931 | 0.699277 | 0.113661 | 0.0172499 | 0.389797 | 0.731438 | 0.179854 | 0.181849 | 0.64722 | 0.73755 | 0.0387112 | 0.163176 | 0.5831 | 0.618315 | 0.858166 | 0.487437 | 0.649856 | 0.510468 | 0.265383 | 0.716345 | 0.858997 | 0.421035 | 0.71227 | 0.210916 | 0.0728364 | 0.115068 | 0.923333 | 0.872 | 0.394273 | 0.575864 | 0.947865 | 0.511522 | 0.00605284 | 0.818524 | 0.226928 | 0.949736 | 0.757384 | 0.66079 | 0.827428 | 0.568193 | 0.00421599 | 0.12546 | 0.585029 | 0.31564 | 0.981435 | 0.229752 | 0.826171 | 0.133524 | 0.286998 | 0.544742 | 0.70628 | 0.61295 | 0.130509 | 0.473567 | 0.410999 | 0.685049 | ⋯ |
5 | 0.967143 | 0.302672 | 0.706902 | 0.0653216 | 0.644417 | 0.976457 | 0.36895 | 0.80553 | 0.704988 | 0.229158 | 0.422051 | 0.0835024 | 0.0600902 | 0.139551 | 0.851271 | 0.513128 | 0.4164 | 0.7523 | 0.173736 | 0.189301 | 0.389565 | 0.701714 | 0.743298 | 0.439659 | 0.646357 | 0.0539669 | 0.835487 | 0.632989 | 0.374562 | 0.266812 | 0.344614 | 0.0987843 | 0.205053 | 0.0121658 | 0.714131 | 0.94708 | 0.161188 | 0.31177 | 0.583715 | 0.887539 | 0.337576 | 0.0647835 | 0.78173 | 0.796417 | 0.426654 | 0.264726 | 0.10321 | 0.341679 | 0.180502 | 0.347792 | 0.142075 | 0.159654 | 0.847052 | 0.553683 | 0.548342 | 0.898875 | 0.147509 | 0.34916 | 0.49193 | 0.830147 | 0.75918 | 0.29523 | 0.552179 | 0.395406 | 0.203592 | 0.355477 | 0.128912 | 0.460629 | 0.974718 | 0.598744 | 0.536191 | 0.49289 | 0.337427 | 0.266863 | 0.598862 | 0.0480737 | 0.0973554 | 0.0952945 | 0.327903 | 0.33965 | 0.942363 | 0.424944 | 0.609847 | 0.0431133 | 0.938313 | 0.678512 | 0.246945 | 0.758531 | 0.143185 | 0.54149 | 0.290775 | 0.854097 | 0.5934 | 0.0493715 | 0.308289 | 0.932613 | 0.100065 | 0.83117 | 0.964044 | 0.532569 | ⋯ |
6 | 0.205168 | 0.683128 | 0.786909 | 0.819635 | 0.759663 | 0.588019 | 0.244011 | 0.888136 | 0.794367 | 0.360823 | 0.199259 | 0.826867 | 0.761233 | 0.0448482 | 0.856025 | 0.901308 | 0.540944 | 0.560751 | 0.352295 | 0.767541 | 0.44367 | 0.678469 | 0.19966 | 0.732522 | 0.502541 | 0.597413 | 0.656933 | 0.301444 | 0.181241 | 0.157532 | 0.714083 | 0.171852 | 0.42279 | 0.699374 | 0.13507 | 0.682853 | 0.543554 | 0.0820418 | 0.270709 | 0.835811 | 0.322101 | 0.383821 | 0.39942 | 0.21642 | 0.689587 | 0.290182 | 0.0742331 | 0.727136 | 0.208275 | 0.485364 | 0.432417 | 0.643245 | 0.124293 | 0.812727 | 0.335484 | 0.0194431 | 0.143856 | 0.254743 | 0.47993 | 0.00236364 | 0.896375 | 0.528712 | 0.600894 | 0.0705582 | 0.41725 | 0.1068 | 0.267478 | 0.745548 | 0.129849 | 0.269369 | 0.968378 | 0.73323 | 0.250992 | 0.0711908 | 0.763779 | 0.878408 | 0.695854 | 0.800981 | 0.0822155 | 0.228832 | 0.640236 | 0.739513 | 0.629127 | 0.175419 | 0.675191 | 0.851447 | 0.841781 | 0.748765 | 0.897175 | 0.518425 | 0.568143 | 0.792276 | 0.561812 | 0.59942 | 0.927527 | 0.105334 | 0.325056 | 0.0615128 | 0.668108 | 0.584317 | ⋯ |
7 | 0.527184 | 0.22961 | 0.395737 | 0.712888 | 0.286281 | 0.0784137 | 0.214371 | 0.693681 | 0.812741 | 0.850144 | 0.633995 | 0.380388 | 0.513486 | 0.104272 | 0.894109 | 0.774682 | 0.79541 | 0.951264 | 0.253308 | 0.294483 | 0.457423 | 0.15226 | 0.517339 | 0.959809 | 0.261888 | 0.562237 | 0.125323 | 0.883358 | 0.301598 | 0.0422402 | 0.355124 | 0.642006 | 0.691682 | 0.911635 | 0.53746 | 0.596286 | 0.385353 | 0.663463 | 0.581572 | 0.318694 | 0.498642 | 0.525428 | 0.549403 | 0.976401 | 0.144109 | 0.572332 | 0.331346 | 0.641637 | 0.934612 | 0.928831 | 0.642698 | 0.25521 | 0.30113 | 0.698648 | 0.788803 | 0.93338 | 0.188807 | 0.390641 | 0.916819 | 0.517061 | 0.844072 | 0.902947 | 0.322715 | 0.158913 | 0.849192 | 0.0943718 | 0.539887 | 0.853765 | 0.10927 | 0.558812 | 0.400448 | 0.971353 | 0.830961 | 0.302348 | 0.666566 | 0.17653 | 0.234576 | 0.553072 | 0.804793 | 0.661114 | 0.794541 | 0.495311 | 0.696857 | 0.268942 | 0.792013 | 0.739865 | 0.305898 | 0.784717 | 0.154662 | 0.950777 | 0.810738 | 0.991764 | 0.180139 | 0.954313 | 0.297343 | 0.436954 | 0.978416 | 0.32379 | 0.424663 | 0.877706 | ⋯ |
8 | 0.951162 | 0.963781 | 0.176482 | 0.751149 | 0.394366 | 0.945845 | 0.62704 | 0.96774 | 0.103318 | 0.674938 | 0.550164 | 0.717422 | 0.606048 | 0.295866 | 0.538989 | 0.13301 | 0.939215 | 0.418093 | 0.649185 | 0.983029 | 0.578834 | 0.0552365 | 0.154984 | 0.590466 | 0.108776 | 0.449178 | 0.248731 | 0.174672 | 0.983469 | 0.21593 | 0.820917 | 0.556617 | 0.429342 | 0.436862 | 0.435403 | 0.041061 | 0.0840443 | 0.651938 | 0.818201 | 0.839926 | 0.995171 | 0.78029 | 0.0299824 | 0.606447 | 0.467301 | 0.838808 | 0.624765 | 0.614341 | 0.740568 | 0.308466 | 0.160896 | 0.682842 | 0.556888 | 0.0769152 | 0.821016 | 0.416731 | 0.392941 | 0.23032 | 0.425042 | 0.361674 | 0.570486 | 0.383419 | 0.812583 | 0.496401 | 0.538982 | 0.633557 | 0.994718 | 0.893841 | 0.79299 | 0.961518 | 0.627831 | 0.0425985 | 0.424132 | 0.86962 | 0.78264 | 0.116515 | 0.751977 | 0.0872114 | 0.148502 | 0.149564 | 0.718268 | 0.719193 | 0.305653 | 0.737801 | 0.0611981 | 0.285703 | 0.83414 | 0.92305 | 0.0805378 | 0.436171 | 0.0366427 | 0.439864 | 0.749147 | 0.925019 | 0.519798 | 0.550835 | 0.229706 | 0.437717 | 0.45817 | 0.104438 | ⋯ |
9 | 0.536369 | 0.118179 | 0.958758 | 0.161565 | 0.722348 | 0.35534 | 0.734991 | 0.859331 | 0.161337 | 0.937488 | 0.952916 | 0.72484 | 0.551236 | 0.937295 | 0.981265 | 0.447206 | 0.945109 | 0.610249 | 0.628314 | 0.54268 | 0.0120982 | 0.850587 | 0.569141 | 0.068651 | 0.0570009 | 0.184186 | 0.228802 | 0.71854 | 0.453599 | 0.226191 | 0.315366 | 0.780655 | 0.0391108 | 0.25336 | 0.502131 | 0.334083 | 0.324837 | 0.817279 | 0.726528 | 0.967909 | 0.303368 | 0.783143 | 0.014402 | 0.048123 | 0.284366 | 0.158176 | 0.665071 | 0.512639 | 0.106427 | 0.281117 | 0.340889 | 0.336567 | 0.736707 | 0.510942 | 0.806805 | 0.0741948 | 0.452598 | 0.116301 | 0.700735 | 0.575428 | 0.656397 | 0.316217 | 0.182634 | 0.934485 | 0.602241 | 0.162334 | 0.582911 | 0.73493 | 0.636253 | 0.533644 | 0.820971 | 0.0575013 | 0.604868 | 0.579263 | 0.259423 | 0.839358 | 0.542788 | 0.428902 | 0.938494 | 0.634835 | 0.0895191 | 0.571168 | 0.637538 | 0.740823 | 0.955097 | 0.124968 | 0.149634 | 0.981464 | 0.149791 | 0.473599 | 0.262707 | 0.900575 | 0.660074 | 0.591248 | 0.237099 | 0.189881 | 0.953265 | 0.66572 | 0.43152 | 0.147092 | ⋯ |
10 | 0.711389 | 0.661071 | 0.450584 | 0.791275 | 0.0225357 | 0.582416 | 0.832604 | 0.0454298 | 0.797738 | 0.954529 | 0.372639 | 0.58931 | 0.853659 | 0.0402739 | 0.580399 | 0.350103 | 0.724024 | 0.155839 | 0.43201 | 0.124057 | 0.96206 | 0.00690846 | 0.724126 | 0.885301 | 0.879999 | 0.226902 | 0.203364 | 0.699919 | 0.704127 | 0.73523 | 0.00855446 | 0.513316 | 0.683089 | 0.23465 | 0.544689 | 0.728027 | 0.644509 | 0.044326 | 0.706912 | 0.203673 | 0.101273 | 0.260113 | 0.0854556 | 0.226671 | 0.926063 | 0.954235 | 0.146336 | 0.676873 | 0.283407 | 0.616687 | 0.566845 | 0.359589 | 0.182991 | 0.595647 | 0.599248 | 0.219112 | 0.363072 | 0.0348181 | 0.175731 | 0.735806 | 0.703981 | 0.300725 | 0.782669 | 0.721319 | 0.459928 | 0.569251 | 0.810615 | 0.575424 | 0.432598 | 0.820052 | 0.530884 | 0.1676 | 0.458789 | 0.154666 | 0.345783 | 0.000553644 | 0.961773 | 0.555288 | 0.133719 | 0.693737 | 0.495709 | 0.155871 | 0.21023 | 0.791769 | 0.0313581 | 0.638755 | 0.939785 | 0.643352 | 0.754186 | 0.457223 | 0.749805 | 0.839532 | 0.553675 | 0.282028 | 0.682128 | 0.0256411 | 0.896242 | 0.716386 | 0.949385 | 0.386708 | ⋯ |
11 | 0.103929 | 0.489357 | 0.943654 | 0.981498 | 0.906475 | 0.27605 | 0.319854 | 0.505438 | 0.27844 | 0.778794 | 0.130514 | 0.169551 | 0.596171 | 0.458798 | 0.95805 | 0.0328401 | 0.58501 | 0.834443 | 0.285972 | 0.552548 | 0.724397 | 0.744955 | 0.0488314 | 0.390639 | 0.465757 | 0.160463 | 0.70125 | 0.194893 | 0.526646 | 0.365876 | 0.0372343 | 0.726442 | 0.0175973 | 0.769848 | 0.451365 | 0.401056 | 0.420727 | 0.560409 | 0.471766 | 0.337263 | 0.145229 | 0.0791614 | 0.758629 | 0.698382 | 0.75589 | 0.629202 | 0.902828 | 0.605323 | 0.034645 | 0.491409 | 0.0471005 | 0.426415 | 0.518352 | 0.22079 | 0.0364343 | 0.965219 | 0.239123 | 0.304706 | 0.654597 | 0.311824 | 0.175638 | 0.090416 | 0.977695 | 0.644337 | 0.856763 | 0.637181 | 0.373946 | 0.98665 | 0.468634 | 0.969294 | 0.0833085 | 0.458731 | 0.822941 | 0.633568 | 0.216647 | 0.680109 | 0.0503443 | 0.217752 | 0.55126 | 0.0549194 | 0.968164 | 0.638207 | 0.464517 | 0.383313 | 0.13081 | 0.524253 | 0.327397 | 0.0178429 | 0.857295 | 0.729045 | 0.556046 | 0.317141 | 0.801009 | 0.381532 | 0.784044 | 0.787365 | 0.311033 | 0.410349 | 0.551543 | 0.933318 | ⋯ |
12 | 0.806704 | 0.597994 | 0.293338 | 0.71464 | 0.821485 | 0.421753 | 0.578613 | 0.274216 | 0.424539 | 0.379679 | 0.839805 | 0.902456 | 0.252799 | 0.563533 | 0.23067 | 0.806573 | 0.402857 | 0.7225 | 0.763932 | 0.663562 | 0.839098 | 0.026974 | 0.615383 | 0.143334 | 0.444809 | 0.367943 | 0.506325 | 0.0397944 | 0.604328 | 0.357131 | 0.422544 | 0.0409795 | 0.533926 | 0.501725 | 0.821395 | 0.374021 | 0.769564 | 0.676575 | 0.439214 | 0.759525 | 0.824962 | 0.58322 | 0.115094 | 0.906221 | 0.826537 | 0.829707 | 0.738376 | 0.418851 | 0.191779 | 0.469279 | 0.222718 | 0.523448 | 0.129809 | 0.00748883 | 0.437603 | 0.094494 | 0.493551 | 0.191608 | 0.449934 | 0.653904 | 0.684418 | 0.0223881 | 0.754044 | 0.339028 | 0.384908 | 0.262317 | 0.912023 | 0.754209 | 0.506818 | 0.825813 | 0.460699 | 0.00281085 | 0.941587 | 0.0613782 | 0.169732 | 0.403069 | 0.155711 | 0.322738 | 0.973102 | 0.0762117 | 0.569624 | 0.34844 | 0.335365 | 0.582599 | 0.107247 | 0.607799 | 0.512198 | 0.00637144 | 0.328829 | 0.714854 | 0.339914 | 0.985819 | 0.500261 | 0.521552 | 0.28377 | 0.20955 | 0.469604 | 0.652663 | 0.683621 | 0.746581 | ⋯ |
13 | 0.870539 | 0.0866983 | 0.720978 | 0.957936 | 0.410978 | 0.132653 | 0.665372 | 0.593325 | 0.0465987 | 0.135042 | 0.0570036 | 0.121709 | 0.430551 | 0.724999 | 0.46881 | 0.626354 | 0.397452 | 0.254975 | 0.517491 | 0.373471 | 0.384463 | 0.897556 | 0.257818 | 0.167195 | 0.560633 | 0.726255 | 0.478231 | 0.185042 | 0.624979 | 0.771154 | 0.912102 | 0.111355 | 0.464301 | 0.0200843 | 0.170537 | 0.334158 | 0.588337 | 0.980649 | 0.313508 | 0.800289 | 0.684536 | 0.813531 | 0.748087 | 0.607404 | 0.865426 | 0.364455 | 0.407324 | 0.719172 | 0.47475 | 0.543448 | 0.884125 | 0.495734 | 0.895777 | 0.388243 | 0.788063 | 0.449247 | 0.326003 | 0.975798 | 0.0260481 | 0.883035 | 0.55042 | 0.871367 | 0.16296 | 0.879233 | 0.590651 | 0.712671 | 0.918869 | 0.872322 | 0.624434 | 0.349621 | 0.216777 | 0.605672 | 0.226441 | 0.0657425 | 0.117751 | 0.500955 | 0.19695 | 0.0973536 | 0.0954175 | 0.0388831 | 0.646106 | 0.141421 | 0.97455 | 0.100159 | 0.541999 | 0.250148 | 0.974178 | 0.819375 | 0.613793 | 0.302014 | 0.472027 | 0.329189 | 0.177003 | 0.373291 | 0.812304 | 0.433818 | 0.597939 | 0.283612 | 0.781218 | 0.550587 | ⋯ |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋱ |
989 | 0.7938 | 0.470203 | 0.93174 | 0.441562 | 0.799881 | 0.214348 | 0.0726274 | 0.802777 | 0.885255 | 0.185748 | 0.584961 | 0.158936 | 0.630962 | 0.320573 | 0.00954083 | 0.282687 | 0.00115615 | 0.459449 | 0.823161 | 0.347262 | 0.977906 | 0.941176 | 0.795364 | 0.84513 | 0.844841 | 0.570935 | 0.506017 | 0.957406 | 0.367465 | 0.52825 | 0.0497843 | 0.555031 | 0.215266 | 0.821181 | 0.622771 | 0.989031 | 0.916478 | 0.0275367 | 0.771105 | 0.36826 | 0.403422 | 0.527175 | 0.162078 | 0.618589 | 0.996844 | 0.145187 | 0.736259 | 0.0143942 | 0.245268 | 0.89905 | 0.524522 | 0.174398 | 0.477557 | 0.951929 | 0.501704 | 0.8789 | 0.108654 | 0.487752 | 0.280716 | 0.849147 | 0.515193 | 0.404184 | 0.55527 | 0.870357 | 0.138122 | 0.14683 | 0.264191 | 0.784621 | 0.96551 | 0.483133 | 0.95682 | 0.917352 | 0.757755 | 0.204985 | 0.564966 | 0.17223 | 0.521205 | 0.965916 | 0.79294 | 0.100891 | 0.427409 | 0.311364 | 0.409801 | 0.660353 | 0.206206 | 0.168982 | 0.391606 | 0.71987 | 0.986182 | 0.708994 | 0.480761 | 0.815604 | 0.161045 | 0.534418 | 0.471593 | 0.837218 | 0.905971 | 0.588907 | 0.307485 | 0.253986 | ⋯ |
990 | 0.335507 | 0.67085 | 0.213006 | 0.712196 | 0.490469 | 0.833114 | 0.553566 | 0.190972 | 0.965411 | 0.773296 | 0.192789 | 0.77683 | 0.923338 | 0.849017 | 0.528022 | 0.170552 | 0.600123 | 0.364796 | 0.0772643 | 0.343971 | 0.911577 | 0.038456 | 0.236707 | 0.908477 | 0.809731 | 0.946195 | 0.397111 | 0.39011 | 0.00902716 | 0.19703 | 0.103327 | 0.213441 | 0.149104 | 0.839598 | 0.453605 | 0.164099 | 0.177608 | 0.0488344 | 0.39813 | 0.530643 | 0.540876 | 0.269065 | 0.733004 | 0.000307431 | 0.509096 | 0.158382 | 0.565569 | 0.917348 | 0.0036253 | 0.317566 | 0.603379 | 0.068804 | 0.794897 | 0.838435 | 0.416257 | 0.734179 | 0.958465 | 0.175242 | 0.91946 | 0.421771 | 0.99784 | 0.0327437 | 0.349029 | 0.224989 | 0.788229 | 0.310657 | 0.626578 | 0.343402 | 0.070237 | 0.73771 | 0.976649 | 0.752871 | 0.762147 | 0.155486 | 0.598484 | 0.52628 | 0.00475203 | 0.813156 | 0.648726 | 0.960313 | 0.917382 | 0.802392 | 0.846952 | 0.922317 | 0.0280845 | 0.408411 | 0.430433 | 0.294435 | 0.84664 | 0.822003 | 0.762444 | 0.336704 | 0.292204 | 0.371234 | 0.300788 | 0.265386 | 0.19183 | 0.156133 | 0.122974 | 0.62161 | ⋯ |
991 | 0.318699 | 0.855621 | 0.89789 | 0.186005 | 0.757165 | 0.537527 | 0.523455 | 0.810702 | 0.413644 | 0.290425 | 0.958096 | 0.578765 | 0.63485 | 0.262268 | 0.245943 | 0.497027 | 0.366844 | 0.414837 | 0.346189 | 0.406763 | 0.811995 | 0.450049 | 0.889867 | 0.0975864 | 0.698496 | 0.469912 | 0.211348 | 0.270257 | 0.511298 | 0.324656 | 0.567215 | 0.0882201 | 0.20443 | 0.849721 | 0.139073 | 0.392252 | 0.122165 | 0.545188 | 0.638874 | 0.124667 | 0.784196 | 0.682275 | 0.860103 | 0.787796 | 0.286325 | 0.867909 | 0.988358 | 0.245936 | 0.214267 | 0.0564961 | 0.207305 | 0.745315 | 0.552497 | 0.53583 | 0.519448 | 0.8416 | 0.806909 | 0.444427 | 0.175787 | 0.744833 | 0.226012 | 0.867389 | 0.523275 | 0.753208 | 0.971446 | 0.930404 | 0.0443151 | 0.739493 | 0.177052 | 0.411851 | 0.565202 | 0.219197 | 0.741081 | 0.849491 | 0.923764 | 0.160924 | 0.150755 | 0.727482 | 0.347877 | 0.996283 | 0.952941 | 0.81976 | 0.404814 | 0.949774 | 0.726088 | 0.639014 | 0.298294 | 0.490915 | 0.718063 | 0.552294 | 0.384831 | 0.549358 | 0.724398 | 0.626641 | 0.00012028 | 0.854545 | 0.195479 | 0.0122166 | 0.495049 | 0.930874 | ⋯ |
992 | 0.385526 | 0.0811559 | 0.479614 | 0.163361 | 0.0102084 | 0.649546 | 0.657604 | 0.720042 | 0.465867 | 0.94045 | 0.782627 | 0.691367 | 0.0662754 | 0.732791 | 0.574579 | 0.381599 | 0.0708026 | 0.667565 | 0.531275 | 0.584148 | 0.199724 | 0.455764 | 0.607253 | 0.910883 | 0.737946 | 0.703731 | 0.849884 | 0.495902 | 0.242281 | 0.831206 | 0.207193 | 0.064264 | 0.00131496 | 0.959967 | 0.312419 | 0.595587 | 0.201843 | 0.0075384 | 0.846742 | 0.858428 | 0.579144 | 0.434224 | 0.333309 | 0.626536 | 0.590739 | 0.0646044 | 0.462775 | 0.490585 | 0.740537 | 0.592094 | 0.137367 | 0.549106 | 0.869887 | 0.705717 | 0.717586 | 0.363997 | 0.327315 | 0.654602 | 0.285881 | 0.73758 | 0.476954 | 0.439074 | 0.772946 | 0.774809 | 0.558633 | 0.359676 | 0.249601 | 0.310865 | 0.883622 | 0.594976 | 0.590758 | 0.494172 | 0.97784 | 0.515925 | 0.196056 | 0.759124 | 0.221263 | 0.510825 | 0.68341 | 0.544371 | 0.526324 | 0.192338 | 0.647001 | 0.422677 | 0.833559 | 0.455662 | 0.775089 | 0.554409 | 0.862595 | 0.604311 | 0.348085 | 0.556081 | 0.00258774 | 0.924799 | 0.840598 | 0.239618 | 0.470394 | 0.967358 | 0.038375 | 0.878066 | ⋯ |
993 | 0.678458 | 0.557108 | 0.202757 | 0.493311 | 0.343117 | 0.922908 | 0.261242 | 0.357386 | 0.458701 | 0.418794 | 0.282234 | 0.184072 | 0.778329 | 0.119601 | 0.0760958 | 0.277757 | 0.893947 | 0.0498649 | 0.60048 | 0.626055 | 0.180351 | 0.609328 | 0.448955 | 0.575381 | 0.589045 | 0.470628 | 0.0627738 | 0.917957 | 0.139759 | 0.586553 | 0.904703 | 0.503882 | 0.98279 | 0.678252 | 0.279891 | 0.601591 | 0.312127 | 0.56454 | 0.328936 | 0.334409 | 0.888321 | 0.550556 | 0.918226 | 0.0262243 | 0.444125 | 0.387428 | 0.0830277 | 0.501643 | 0.261101 | 0.723811 | 0.305919 | 0.60967 | 0.461639 | 0.091341 | 0.553738 | 0.985621 | 0.475326 | 0.905171 | 0.446904 | 0.801063 | 0.933702 | 0.526014 | 0.491401 | 0.103181 | 0.669805 | 0.206602 | 0.929505 | 0.405348 | 0.880724 | 0.134753 | 0.919921 | 0.116427 | 0.585408 | 0.263611 | 0.148553 | 0.880533 | 0.291908 | 0.89287 | 0.81273 | 0.230538 | 0.0110548 | 0.215158 | 0.188272 | 0.31742 | 0.340102 | 0.578217 | 0.489938 | 0.760335 | 0.663621 | 0.312315 | 0.108564 | 0.651062 | 0.0475174 | 0.926033 | 0.823137 | 0.300454 | 0.308243 | 0.13366 | 0.865841 | 0.593067 | ⋯ |
994 | 0.424295 | 0.78012 | 0.577681 | 0.274228 | 0.225563 | 0.29553 | 0.395957 | 0.260142 | 0.277949 | 0.463644 | 0.242719 | 0.830015 | 0.591877 | 0.862664 | 0.41373 | 0.365988 | 0.0936457 | 0.645005 | 0.709468 | 0.167342 | 0.195179 | 0.245388 | 0.665987 | 0.652001 | 0.382251 | 0.566473 | 0.00939404 | 0.819653 | 0.288755 | 0.416332 | 0.674978 | 0.026384 | 0.580484 | 0.251398 | 0.120324 | 0.587503 | 0.362091 | 0.279315 | 0.287151 | 0.0236091 | 0.537809 | 0.82536 | 0.53102 | 0.142953 | 0.421291 | 0.630039 | 0.0882846 | 0.758558 | 0.289519 | 0.421927 | 0.833634 | 0.533817 | 0.983997 | 0.0173648 | 0.0952976 | 0.606937 | 0.690477 | 0.396225 | 0.887328 | 0.484774 | 0.421553 | 0.440067 | 0.667265 | 0.314631 | 0.399197 | 0.874844 | 0.151715 | 0.543429 | 0.562391 | 0.727646 | 0.743861 | 0.403023 | 0.593945 | 0.275799 | 0.0417108 | 0.837903 | 0.343106 | 0.335799 | 0.32336 | 0.912103 | 0.759296 | 0.551882 | 0.67661 | 0.335774 | 0.99212 | 0.886219 | 0.192232 | 0.506866 | 0.147 | 0.720016 | 0.411017 | 0.804418 | 0.833963 | 0.588914 | 0.970518 | 0.723264 | 0.173712 | 0.634828 | 0.57976 | 0.206616 | ⋯ |
995 | 0.481554 | 0.0441882 | 0.700181 | 0.317677 | 0.0846761 | 0.949135 | 0.457341 | 0.571596 | 0.0640147 | 0.781281 | 0.432866 | 0.119252 | 0.395297 | 0.286058 | 0.512666 | 0.28892 | 0.889065 | 0.392283 | 0.625326 | 0.638938 | 0.267646 | 0.753392 | 0.621089 | 0.0658605 | 0.59495 | 0.343818 | 0.770693 | 0.94488 | 0.227965 | 0.60915 | 0.296814 | 0.529701 | 0.207136 | 0.959052 | 0.383638 | 0.892441 | 0.62196 | 0.175727 | 0.549517 | 0.137818 | 0.918747 | 0.546968 | 0.668477 | 0.444744 | 0.931012 | 0.148076 | 0.507896 | 0.692397 | 0.942958 | 0.829974 | 0.506719 | 0.667715 | 0.382446 | 0.389034 | 0.592372 | 0.211112 | 0.0536174 | 0.435104 | 0.654613 | 0.0985265 | 0.724492 | 0.120103 | 0.193283 | 0.111822 | 0.136294 | 0.871079 | 0.406365 | 0.153837 | 0.492183 | 0.0707954 | 0.0270011 | 0.854367 | 0.76891 | 0.960738 | 0.256492 | 0.901838 | 0.457021 | 0.0812045 | 0.442128 | 0.397439 | 0.51791 | 0.75772 | 0.323711 | 0.779831 | 0.331937 | 0.896385 | 0.151654 | 0.752883 | 0.61111 | 0.233529 | 0.430181 | 0.162369 | 0.685753 | 0.734598 | 0.787228 | 0.327247 | 0.558746 | 0.632998 | 0.592922 | 0.329131 | ⋯ |
996 | 0.401528 | 0.658903 | 0.438309 | 0.211217 | 0.211515 | 0.441902 | 0.0280903 | 0.412717 | 0.180855 | 0.628166 | 0.949295 | 0.0234695 | 0.626841 | 0.70688 | 0.197621 | 0.216484 | 0.607782 | 0.769642 | 0.949958 | 0.249306 | 0.481014 | 0.697822 | 0.681857 | 0.920842 | 0.391799 | 0.567247 | 0.581468 | 0.275073 | 0.0414921 | 0.960641 | 0.120329 | 0.955094 | 0.84551 | 0.540647 | 0.072297 | 0.423689 | 0.545505 | 0.287923 | 0.281248 | 0.658782 | 0.481355 | 0.586361 | 0.527756 | 0.694976 | 0.772877 | 0.87787 | 0.782064 | 0.981036 | 0.393483 | 0.478884 | 0.673706 | 0.5727 | 0.518801 | 0.518592 | 0.787306 | 0.32225 | 0.668103 | 0.234636 | 0.36528 | 0.954682 | 0.938105 | 0.959623 | 0.151342 | 0.923438 | 0.313945 | 0.00595974 | 0.947103 | 0.852135 | 0.178326 | 0.546221 | 0.36211 | 0.143301 | 0.371219 | 0.72453 | 0.181077 | 0.591335 | 0.877893 | 0.717275 | 0.383012 | 0.182125 | 0.0768744 | 0.949392 | 0.437771 | 0.133059 | 0.984711 | 0.725745 | 0.92119 | 0.149844 | 0.591016 | 0.299839 | 0.793963 | 0.435089 | 0.739725 | 0.233321 | 0.0305735 | 0.945321 | 0.0268155 | 0.0756352 | 0.571333 | 0.623045 | ⋯ |
997 | 0.58985 | 0.909857 | 0.354735 | 0.442829 | 0.438322 | 0.847835 | 0.843131 | 0.727618 | 0.043537 | 0.2958 | 0.0626168 | 0.0375887 | 0.480541 | 0.187324 | 0.159619 | 0.681808 | 0.382427 | 0.7819 | 0.193915 | 0.918742 | 0.339773 | 0.996588 | 0.0750327 | 0.667384 | 0.813943 | 0.648744 | 0.778451 | 0.352497 | 0.805257 | 0.950376 | 0.286605 | 0.861475 | 0.137177 | 0.586876 | 0.747203 | 0.562411 | 0.633759 | 0.00705207 | 0.544606 | 0.818714 | 0.740385 | 0.56901 | 0.0370867 | 0.707534 | 0.119938 | 0.754687 | 0.113022 | 0.153186 | 0.571747 | 0.664875 | 0.884798 | 0.686204 | 0.336496 | 0.759956 | 0.0350065 | 0.238342 | 0.829981 | 0.778296 | 0.400111 | 0.460372 | 0.122122 | 0.0100303 | 0.514388 | 0.157644 | 0.439795 | 0.43117 | 0.169505 | 0.457244 | 0.557165 | 0.801513 | 0.70288 | 0.081918 | 0.21622 | 0.202525 | 0.973381 | 0.630359 | 0.625955 | 0.868939 | 0.0546023 | 0.386919 | 0.111863 | 0.444876 | 0.0105433 | 0.768942 | 0.94646 | 0.266636 | 0.334259 | 0.533529 | 0.342496 | 0.298016 | 0.768956 | 0.207063 | 0.587475 | 0.901736 | 0.337462 | 0.0886105 | 0.754298 | 0.595919 | 0.955853 | 0.548087 | ⋯ |
998 | 0.871944 | 0.821183 | 0.725636 | 0.521645 | 0.230773 | 0.506747 | 0.47283 | 0.902559 | 0.806089 | 0.710561 | 0.343707 | 0.291233 | 0.948628 | 0.525034 | 0.408663 | 0.784639 | 0.207999 | 0.916874 | 0.973697 | 0.594586 | 0.120818 | 0.591678 | 0.555085 | 0.269047 | 0.149341 | 0.836679 | 0.174623 | 0.0653285 | 0.247579 | 0.800229 | 0.805376 | 0.141725 | 0.70089 | 0.484247 | 0.542784 | 0.133376 | 0.706461 | 0.139403 | 0.311023 | 0.252198 | 0.110557 | 0.139412 | 0.402585 | 0.313334 | 0.485571 | 0.923477 | 0.232277 | 0.18273 | 0.216354 | 0.34598 | 0.716719 | 0.89775 | 0.85979 | 0.469057 | 0.353088 | 0.0658065 | 0.630643 | 0.194692 | 0.945528 | 0.263228 | 0.227477 | 0.898712 | 0.573841 | 0.834761 | 0.678992 | 0.0698554 | 0.606778 | 0.0664735 | 0.870948 | 0.0566045 | 0.240453 | 0.118974 | 0.814929 | 0.628905 | 0.834959 | 0.341687 | 0.303889 | 0.202003 | 0.639411 | 0.697289 | 0.00421912 | 0.799744 | 0.488837 | 0.335656 | 0.0953115 | 0.69099 | 0.928979 | 0.630334 | 0.225244 | 0.977107 | 0.108807 | 0.631775 | 0.0412976 | 0.621693 | 0.401818 | 0.599301 | 0.298324 | 0.427501 | 0.0149224 | 0.466253 | ⋯ |
999 | 0.702069 | 0.399582 | 0.377331 | 0.542735 | 0.38631 | 0.555332 | 0.742833 | 0.787603 | 0.728299 | 0.0205224 | 0.0392208 | 0.59181 | 0.134989 | 0.0577118 | 0.237958 | 0.66476 | 0.941283 | 0.801746 | 0.6268 | 0.885924 | 0.98753 | 0.555681 | 0.0225242 | 0.867217 | 0.213395 | 0.844451 | 0.45556 | 0.404991 | 0.548041 | 0.244942 | 0.378365 | 0.691279 | 0.478979 | 0.925638 | 0.583101 | 0.140049 | 0.772856 | 0.561587 | 0.00306217 | 0.549884 | 0.342337 | 0.588316 | 0.276132 | 0.589743 | 0.220219 | 0.545051 | 0.63901 | 0.0112304 | 0.789922 | 0.619918 | 0.888162 | 0.888522 | 0.0871537 | 0.170892 | 0.257639 | 0.0514265 | 0.0285141 | 0.495923 | 0.0645104 | 0.0369986 | 0.737504 | 0.352119 | 0.209841 | 0.540092 | 0.592286 | 0.027314 | 0.092139 | 0.600756 | 0.164885 | 0.409008 | 0.947233 | 0.177876 | 0.643399 | 0.742034 | 0.269545 | 0.464876 | 0.338276 | 0.464954 | 0.613252 | 0.561679 | 0.612696 | 0.313784 | 0.7675 | 0.901055 | 0.696108 | 0.0105366 | 0.266548 | 0.00447812 | 0.502627 | 0.454181 | 0.249365 | 0.361495 | 0.865166 | 0.546343 | 0.941237 | 0.859312 | 0.409052 | 0.32661 | 0.400972 | 0.768701 | ⋯ |
1000 | 0.70703 | 0.166208 | 0.45823 | 0.0682235 | 0.458422 | 0.836939 | 0.58208 | 0.721586 | 0.171453 | 0.353899 | 0.922517 | 0.394151 | 0.661091 | 0.381341 | 0.558679 | 0.847687 | 0.671701 | 0.373212 | 0.196267 | 0.353311 | 0.813622 | 0.192031 | 0.911645 | 0.452465 | 0.604791 | 0.743038 | 0.776148 | 0.00348142 | 0.332103 | 0.903827 | 0.88745 | 0.891842 | 0.203428 | 0.259077 | 0.243465 | 0.877412 | 0.614433 | 0.766943 | 0.22814 | 0.302435 | 0.849376 | 0.194502 | 0.903999 | 0.0363242 | 0.536268 | 0.886766 | 0.491461 | 0.87315 | 0.0148271 | 0.257042 | 0.210517 | 0.570115 | 0.684481 | 0.466046 | 0.367677 | 0.692357 | 0.922489 | 0.0315743 | 0.964974 | 0.523279 | 0.338024 | 0.509674 | 0.163316 | 0.362407 | 0.301272 | 0.879559 | 0.499213 | 0.631802 | 0.331658 | 0.370499 | 0.450381 | 0.604831 | 0.125916 | 0.934434 | 0.306163 | 0.207142 | 0.471049 | 0.0668867 | 0.0232546 | 0.212272 | 0.267954 | 0.398719 | 0.695908 | 0.850557 | 0.0875997 | 0.421732 | 0.330967 | 0.784816 | 0.0130859 | 0.341161 | 0.396292 | 0.412785 | 0.302764 | 0.849904 | 0.175216 | 0.687009 | 0.909296 | 0.692519 | 0.244417 | 0.0393655 | ⋯ |
"
]
},
"metadata": {},
"execution_count": 86
}
],
"cell_type": "code",
"source": [
"large_df = DataFrame(rand(1000, 10000), :auto)"
],
"metadata": {},
"execution_count": 86
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.059108 seconds (80.99 k allocations: 5.113 MiB, 88.72% compilation time: 93% of which was recompilation)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m1000×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m sum \u001b[0m\n │\u001b[90m Float64 \u001b[0m\n──────┼─────────\n 1 │ 5010.73\n 2 │ 5055.37\n 3 │ 4987.2\n 4 │ 5016.2\n 5 │ 4978.86\n 6 │ 5018.0\n 7 │ 5011.07\n 8 │ 4991.09\n ⋮ │ ⋮\n 994 │ 4933.86\n 995 │ 4952.49\n 996 │ 5015.75\n 997 │ 5013.19\n 998 │ 4976.72\n 999 │ 5015.42\n 1000 │ 5054.93\n\u001b[36m985 rows omitted\u001b[0m",
"text/html": [
"1000×1 DataFrame
975 rows omitted
1 | 5010.73 |
2 | 5055.37 |
3 | 4987.2 |
4 | 5016.2 |
5 | 4978.86 |
6 | 5018.0 |
7 | 5011.07 |
8 | 4991.09 |
9 | 4932.78 |
10 | 5017.35 |
11 | 4997.14 |
12 | 4994.29 |
13 | 5020.93 |
⋮ | ⋮ |
989 | 4975.49 |
990 | 4959.65 |
991 | 4997.01 |
992 | 5003.09 |
993 | 4984.85 |
994 | 4933.86 |
995 | 4952.49 |
996 | 5015.75 |
997 | 5013.19 |
998 | 4976.72 |
999 | 5015.42 |
1000 | 5054.93 |
"
]
},
"metadata": {},
"execution_count": 87
}
],
"cell_type": "code",
"source": [
"@time select(large_df, AsTable(:) => ByRow(sum) => :sum)"
],
"metadata": {},
"execution_count": 87
},
{
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" 0.006300 seconds (19.63 k allocations: 1.071 MiB)\n"
]
},
{
"output_type": "execute_result",
"data": {
"text/plain": "\u001b[1m1000×1 DataFrame\u001b[0m\n\u001b[1m Row \u001b[0m│\u001b[1m sum \u001b[0m\n │\u001b[90m Float64 \u001b[0m\n──────┼─────────\n 1 │ 5010.73\n 2 │ 5055.37\n 3 │ 4987.2\n 4 │ 5016.2\n 5 │ 4978.86\n 6 │ 5018.0\n 7 │ 5011.07\n 8 │ 4991.09\n ⋮ │ ⋮\n 994 │ 4933.86\n 995 │ 4952.49\n 996 │ 5015.75\n 997 │ 5013.19\n 998 │ 4976.72\n 999 │ 5015.42\n 1000 │ 5054.93\n\u001b[36m985 rows omitted\u001b[0m",
"text/html": [
"1000×1 DataFrame
975 rows omitted
1 | 5010.73 |
2 | 5055.37 |
3 | 4987.2 |
4 | 5016.2 |
5 | 4978.86 |
6 | 5018.0 |
7 | 5011.07 |
8 | 4991.09 |
9 | 4932.78 |
10 | 5017.35 |
11 | 4997.14 |
12 | 4994.29 |
13 | 5020.93 |
⋮ | ⋮ |
989 | 4975.49 |
990 | 4959.65 |
991 | 4997.01 |
992 | 5003.09 |
993 | 4984.85 |
994 | 4933.86 |
995 | 4952.49 |
996 | 5015.75 |
997 | 5013.19 |
998 | 4976.72 |
999 | 5015.42 |
1000 | 5054.93 |
"
]
},
"metadata": {},
"execution_count": 88
}
],
"cell_type": "code",
"source": [
"@time select(large_df, AsTable(:) => ByRow(sum) => :sum)"
],
"metadata": {},
"execution_count": 88
},
{
"cell_type": "markdown",
"source": [
"---\n",
"\n",
"*This notebook was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).*"
],
"metadata": {}
}
],
"nbformat_minor": 3,
"metadata": {
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "1.10.5"
},
"kernelspec": {
"name": "julia-1.10",
"display_name": "Julia 1.10.5",
"language": "julia"
}
},
"nbformat": 4
}