Transformation to DataFrames#
Split-apply-combine
using DataFrames
Grouping a data frame#
x = DataFrame(id=[1, 2, 3, 4, 1, 2, 3, 4], id2=[1, 2, 1, 2, 1, 2, 1, 2], v=rand(8))
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 2 | 2 | 0.890129 |
3 | 3 | 1 | 0.608735 |
4 | 4 | 2 | 0.648561 |
5 | 1 | 1 | 0.187759 |
6 | 2 | 2 | 0.510651 |
7 | 3 | 1 | 0.451765 |
8 | 4 | 2 | 0.847968 |
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
⋮
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 4 | 2 | 0.648561 |
2 | 4 | 2 | 0.847968 |
groupby(x, [])
GroupedDataFrame with 1 group based on key:
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 2 | 2 | 0.890129 |
3 | 3 | 1 | 0.608735 |
4 | 4 | 2 | 0.648561 |
5 | 1 | 1 | 0.187759 |
6 | 2 | 2 | 0.510651 |
7 | 3 | 1 | 0.451765 |
8 | 4 | 2 | 0.847968 |
gx2 = groupby(x, [:id, :id2])
GroupedDataFrame with 4 groups based on keys: id, id2
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
⋮
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 4 | 2 | 0.648561 |
2 | 4 | 2 | 0.847968 |
get the parent DataFrame
parent(gx2)
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 2 | 2 | 0.890129 |
3 | 3 | 1 | 0.608735 |
4 | 4 | 2 | 0.648561 |
5 | 1 | 1 | 0.187759 |
6 | 2 | 2 | 0.510651 |
7 | 3 | 1 | 0.451765 |
8 | 4 | 2 | 0.847968 |
back to the DataFrame, but in a different order of rows than the original
vcat(gx2...)
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
3 | 2 | 2 | 0.890129 |
4 | 2 | 2 | 0.510651 |
5 | 3 | 1 | 0.608735 |
6 | 3 | 1 | 0.451765 |
7 | 4 | 2 | 0.648561 |
8 | 4 | 2 | 0.847968 |
the same
DataFrame(gx2)
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
3 | 2 | 2 | 0.890129 |
4 | 2 | 2 | 0.510651 |
5 | 3 | 1 | 0.608735 |
6 | 3 | 1 | 0.451765 |
7 | 4 | 2 | 0.648561 |
8 | 4 | 2 | 0.847968 |
drop grouping columns when creating a data frame
DataFrame(gx2, keepkeys=false)
Row | v |
---|---|
Float64 | |
1 | 0.0298167 |
2 | 0.187759 |
3 | 0.890129 |
4 | 0.510651 |
5 | 0.608735 |
6 | 0.451765 |
7 | 0.648561 |
8 | 0.847968 |
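The difference between parent and DataFrame on a grouped data frame can be checked on a tiny example. This is a minimal sketch; the data and the variable names (same_object, stacked, values_only) are made up for illustration, and sort=true is passed only to make the group order predictable.

```julia
using DataFrames

df = DataFrame(id=[2, 1, 2, 1], v=1:4)
gdf = groupby(df, :id, sort=true)

# parent returns the original object itself, not a copy
same_object = parent(gdf) === df

# DataFrame(gdf) copies the rows group by group: id 1 first, then id 2
stacked = DataFrame(gdf)

# keepkeys=false drops the grouping column :id
values_only = DataFrame(gdf, keepkeys=false)
```

Here stacked holds the same rows as df, but collected group by group rather than in the source order.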
vector of names of grouping variables
groupcols(gx2)
2-element Vector{Symbol}:
:id
:id2
and non-grouping variables
valuecols(gx2)
1-element Vector{Symbol}:
:v
group indices in parent(gx2)
groupindices(gx2)
8-element Vector{Union{Missing, Int64}}:
1
2
3
4
1
2
3
4
kgx2 = keys(gx2)
4-element DataFrames.GroupKeys{GroupedDataFrame{DataFrame}}:
GroupKey: (id = 1, id2 = 1)
GroupKey: (id = 2, id2 = 2)
GroupKey: (id = 3, id2 = 1)
GroupKey: (id = 4, id2 = 2)
You can index into a GroupedDataFrame like a vector or like a dictionary. The second form accepts a GroupKey, a NamedTuple, or a Tuple.
gx2
GroupedDataFrame with 4 groups based on keys: id, id2
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
⋮
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 4 | 2 | 0.648561 |
2 | 4 | 2 | 0.847968 |
k = keys(gx2)[1]
GroupKey: (id = 1, id2 = 1)
ntk = NamedTuple(k)
(id = 1, id2 = 1)
tk = Tuple(k)
(1, 1)
the operations below produce the same result and are fast
gx2[1]
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
gx2[k]
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
gx2[ntk]
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
gx2[tk]
Row | id | id2 | v |
---|---|---|---|
Int64 | Int64 | Float64 | |
1 | 1 | 1 | 0.0298167 |
2 | 1 | 1 | 0.187759 |
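The four indexing forms can be compared side by side on a small data frame. This is a sketch with made-up data; sort=true is an assumption added so that the first group is known to be id = 1, and haskey is shown as one way to test for a group before indexing.

```julia
using DataFrames

df = DataFrame(id=[1, 2, 1, 2], v=[10, 20, 30, 40])
gdf = groupby(df, :id, sort=true)

by_position = gdf[1]               # vector-style: first group
by_tuple = gdf[(1,)]               # dictionary-style: Tuple of key values
by_namedtuple = gdf[(id=1,)]       # dictionary-style: NamedTuple
by_key = gdf[first(keys(gdf))]     # dictionary-style: GroupKey

# haskey lets you check for a group before indexing
has_group_3 = haskey(gdf, (3,))
```

All four lookups return the same SubDataFrame, a view of the rows of df with id equal to 1.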
handling missing values
x = DataFrame(id=[missing, 5, 1, 3, missing], x=1:5)
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | missing | 1 |
2 | 5 | 2 |
3 | 1 | 3 |
4 | 3 | 4 |
5 | missing | 5 |
by default groups include missing values and the order of groups is not guaranteed
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | 1 | 3 |
⋮
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | missing | 1 |
2 | missing | 5 |
but we can change it; now they are sorted
groupby(x, :id, sort=true, skipmissing=true)
GroupedDataFrame with 3 groups based on key: id
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | 1 | 3 |
⋮
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | 5 | 2 |
and now they are in the order they appear in the source data frame
groupby(x, :id, sort=false)
GroupedDataFrame with 4 groups based on key: id
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | missing | 1 |
2 | missing | 5 |
⋮
Row | id | x |
---|---|---|
Int64? | Int64 | |
1 | 3 | 4 |
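The effect of the sort and skipmissing keyword arguments can be verified by counting groups and reading the keys. A minimal sketch on the same kind of data as above; the variable names are illustrative only.

```julia
using DataFrames

df = DataFrame(id=[missing, 5, 1, 3, missing], x=1:5)

# default: missing forms its own group
with_missing = groupby(df, :id)

# skipmissing=true drops that group, sort=true orders the rest by key
without_missing = groupby(df, :id, sort=true, skipmissing=true)

# a GroupKey supports property access by grouping column name
sorted_ids = [k.id for k in keys(without_missing)]
```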
Performing transformations#
by group using combine, select, select!, transform, and transform!
using Statistics
using Chain
reduce the number of rows in the output
ENV["LINES"] = 15
15
x = DataFrame(id=rand('a':'d', 100), v=rand(100))
Row | id | v |
---|---|---|
Char | Float64 | |
1 | b | 0.0947633 |
2 | b | 0.444125 |
3 | c | 0.587247 |
4 | b | 0.330791 |
5 | b | 0.31695 |
6 | b | 7.83911e-5 |
7 | d | 0.728483 |
8 | b | 0.301966 |
9 | d | 0.577585 |
10 | b | 0.360327 |
11 | c | 0.803869 |
12 | c | 0.836187 |
13 | c | 0.633544 |
⋮ | ⋮ | ⋮ |
89 | c | 0.234815 |
90 | a | 0.0572505 |
91 | a | 0.421841 |
92 | a | 0.135045 |
93 | a | 0.702993 |
94 | d | 0.788392 |
95 | a | 0.759564 |
96 | c | 0.445145 |
97 | c | 0.756522 |
98 | b | 0.413141 |
99 | b | 0.617292 |
100 | d | 0.0817082 |
apply a function to each group of a data frame; combine keeps as many rows as are returned from the function
@chain x begin
groupby(:id)
combine(:v => mean)
end
Row | id | v_mean |
---|---|---|
Char | Float64 | |
1 | b | 0.375926 |
2 | c | 0.538606 |
3 | d | 0.60607 |
4 | a | 0.484241 |
x.id2 = axes(x, 1)
Base.OneTo(100)
select and transform keep as many rows as there are in the source data frame, in their original order; additionally, transform keeps all columns from the source
@chain x begin
groupby(:id)
transform(:v => mean)
end
Row | id | v | id2 | v_mean |
---|---|---|---|---|
Char | Float64 | Int64 | Float64 | |
1 | b | 0.0947633 | 1 | 0.375926 |
2 | b | 0.444125 | 2 | 0.375926 |
3 | c | 0.587247 | 3 | 0.538606 |
4 | b | 0.330791 | 4 | 0.375926 |
5 | b | 0.31695 | 5 | 0.375926 |
6 | b | 7.83911e-5 | 6 | 0.375926 |
7 | d | 0.728483 | 7 | 0.60607 |
8 | b | 0.301966 | 8 | 0.375926 |
9 | d | 0.577585 | 9 | 0.60607 |
10 | b | 0.360327 | 10 | 0.375926 |
11 | c | 0.803869 | 11 | 0.538606 |
12 | c | 0.836187 | 12 | 0.538606 |
13 | c | 0.633544 | 13 | 0.538606 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
89 | c | 0.234815 | 89 | 0.538606 |
90 | a | 0.0572505 | 90 | 0.484241 |
91 | a | 0.421841 | 91 | 0.484241 |
92 | a | 0.135045 | 92 | 0.484241 |
93 | a | 0.702993 | 93 | 0.484241 |
94 | d | 0.788392 | 94 | 0.60607 |
95 | a | 0.759564 | 95 | 0.484241 |
96 | c | 0.445145 | 96 | 0.538606 |
97 | c | 0.756522 | 97 | 0.538606 |
98 | b | 0.413141 | 98 | 0.375926 |
99 | b | 0.617292 | 99 | 0.375926 |
100 | d | 0.0817082 | 100 | 0.60607 |
note that combine reorders rows by group of GroupedDataFrame
@chain x begin
groupby(:id)
combine(:id2, :v => mean)
end
Row | id | id2 | v_mean |
---|---|---|---|
Char | Int64 | Float64 | |
1 | b | 1 | 0.375926 |
2 | b | 2 | 0.375926 |
3 | b | 4 | 0.375926 |
4 | b | 5 | 0.375926 |
5 | b | 6 | 0.375926 |
6 | b | 8 | 0.375926 |
7 | b | 10 | 0.375926 |
8 | b | 17 | 0.375926 |
9 | b | 22 | 0.375926 |
10 | b | 23 | 0.375926 |
11 | b | 25 | 0.375926 |
12 | b | 26 | 0.375926 |
13 | b | 27 | 0.375926 |
⋮ | ⋮ | ⋮ | ⋮ |
89 | a | 74 | 0.484241 |
90 | a | 75 | 0.484241 |
91 | a | 77 | 0.484241 |
92 | a | 78 | 0.484241 |
93 | a | 79 | 0.484241 |
94 | a | 80 | 0.484241 |
95 | a | 87 | 0.484241 |
96 | a | 90 | 0.484241 |
97 | a | 91 | 0.484241 |
98 | a | 92 | 0.484241 |
99 | a | 93 | 0.484241 |
100 | a | 95 | 0.484241 |
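The row and column rules for combine, select, and transform are easier to see on four rows than on a hundred. The sketch below uses made-up data and illustrative variable names.

```julia
using DataFrames
using Statistics

df = DataFrame(id=['b', 'a', 'b', 'a'], v=[1.0, 2.0, 3.0, 4.0])
gdf = groupby(df, :id)

combined = combine(gdf, :v => mean)       # one row per group
selected = select(gdf, :v => mean)        # one row per source row: keys + new column
transformed = transform(gdf, :v => mean)  # one row per source row: all columns kept
```

In selected and transformed the rows stay in the order b, a, b, a of the source, and each row carries the mean of its own group.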
we give a custom name for the result column
@chain x begin
groupby(:id)
combine(:v => mean => :res)
end
Row | id | res |
---|---|---|
Char | Float64 | |
1 | b | 0.375926 |
2 | c | 0.538606 |
3 | d | 0.60607 |
4 | a | 0.484241 |
you can have multiple operations
@chain x begin
groupby(:id)
combine(:v => mean => :res1, :v => sum => :res2, nrow => :n)
end
Row | id | res1 | res2 | n |
---|---|---|---|---|
Char | Float64 | Float64 | Int64 | |
1 | b | 0.375926 | 10.5259 | 28 |
2 | c | 0.538606 | 14.0037 | 26 |
3 | d | 0.60607 | 11.5153 | 19 |
4 | a | 0.484241 | 13.0745 | 27 |
Additional notes:
- select! and transform! perform operations in-place
- the general syntax for a transformation is source_columns => function => target_column
- if you pass multiple columns to a function they are treated as positional arguments
- ByRow and AsTable work exactly as discussed for operations on data frames in 05_columns.ipynb
- you can automatically group the result of combine, select, etc. again by passing the ungroup=false keyword argument to them
- similarly, the keepkeys keyword argument allows you to drop grouping columns from the resulting data frame
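The notes above can be tried out together on a small grouped data frame. This is only a sketch: the data, the column names (dot, ab, a_sum), and the variable names are invented for illustration.

```julia
using DataFrames

df = DataFrame(id=[1, 1, 2, 2], a=[1, 2, 3, 4], b=[10, 20, 30, 40])
gdf = groupby(df, :id)

# two source columns arrive as two positional arguments (whole group vectors)
dot = combine(gdf, [:a, :b] => ((a, b) -> sum(a .* b)) => :dot)

# ByRow applies the function to each row instead of to whole columns
rowwise = select(gdf, [:a, :b] => ByRow(+) => :ab)

# ungroup=false returns a GroupedDataFrame instead of a DataFrame
regrouped = combine(gdf, :a => sum => :a_sum; ungroup=false)

# keepkeys=false drops the grouping column :id from the result
nokeys = combine(gdf, :a => sum => :a_sum; keepkeys=false)
```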
You can also pass a function to all these functions (as a special case, it can be the first argument). In this case the return value can be a table. In particular, this makes it easy to drop groups: return an empty table from the function.
If you pass a function you can use the do block syntax. The function receives a SubDataFrame as its argument.
Here is an example:
combine(groupby(x, :id)) do sdf
n = nrow(sdf)
n < 25 ? DataFrame() : DataFrame(n=n) # drop groups with a low number of rows
end
Row | id | n |
---|---|---|
Char | Int64 | |
1 | b | 28 |
2 | c | 26 |
3 | a | 27 |
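Besides a table, the function passed to combine may return a NamedTuple of scalars, which yields one row with several columns per group. The sketch below assumes this behavior and uses made-up data; group_summary, n, and total are illustrative names.

```julia
using DataFrames

df = DataFrame(id=[1, 1, 2], v=[1.0, 2.0, 5.0])

# sdf is the SubDataFrame holding the rows of one group
group_summary = combine(groupby(df, :id)) do sdf
    (n=nrow(sdf), total=sum(sdf.v))
end
```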
You can also produce multiple columns in a single operation, e.g.:
df = DataFrame(id=[1, 1, 2, 2], val=[1, 2, 3, 4])
Row | id | val |
---|---|---|
Int64 | Int64 | |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 3 |
4 | 2 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => AsTable)
end
Row | id | x1 | x2 |
---|---|---|---|
Int64 | Int64 | Int64 | |
1 | 1 | 1 | 2 |
2 | 2 | 3 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => [:c1, :c2])
end
Row | id | c1 | c2 |
---|---|---|---|
Int64 | Int64 | Int64 | |
1 | 1 | 1 | 2 |
2 | 2 | 3 | 4 |
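If the automatically generated names x1, x2 are not wanted, another option is to return a NamedTuple and let AsTable take the column names from its fields. A minimal sketch with the same df as above; lo and hi are names chosen for illustration.

```julia
using DataFrames

df = DataFrame(id=[1, 1, 2, 2], val=[1, 2, 3, 4])

# the NamedTuple fields become the names of the output columns
res = combine(groupby(df, :id),
              :val => (v -> (lo=minimum(v), hi=maximum(v))) => AsTable)
```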
It is easy to unnest a column into multiple columns, e.g.
df = DataFrame(a=[(p=1, q=2), (p=3, q=4)])
Row | a |
---|---|
NamedTup… | |
1 | (p = 1, q = 2) |
2 | (p = 3, q = 4) |
select(df, :a => AsTable)
Row | p | q |
---|---|---|
Int64 | Int64 | |
1 | 1 | 2 |
2 | 3 | 4 |
df = DataFrame(a=[[1, 2], [3, 4]])
Row | a |
---|---|
Array… | |
1 | [1, 2] |
2 | [3, 4] |
automatic column names generated
select(df, :a => AsTable)
Row | x1 | x2 |
---|---|---|
Int64 | Int64 | |
1 | 1 | 2 |
2 | 3 | 4 |
custom column names generated
select(df, :a => [:C1, :C2])
Row | C1 | C2 |
---|---|---|
Int64 | Int64 | |
1 | 1 | 2 |
2 | 3 | 4 |
Finally, observe that one can conveniently apply multiple transformations using broadcasting:
df = DataFrame(id=repeat(1:10, 10), x1=1:100, x2=101:200)
Row | id | x1 | x2 |
---|---|---|---|
Int64 | Int64 | Int64 | |
1 | 1 | 1 | 101 |
2 | 2 | 2 | 102 |
3 | 3 | 3 | 103 |
4 | 4 | 4 | 104 |
5 | 5 | 5 | 105 |
6 | 6 | 6 | 106 |
7 | 7 | 7 | 107 |
8 | 8 | 8 | 108 |
9 | 9 | 9 | 109 |
10 | 10 | 10 | 110 |
11 | 1 | 11 | 111 |
12 | 2 | 12 | 112 |
13 | 3 | 13 | 113 |
⋮ | ⋮ | ⋮ | ⋮ |
89 | 9 | 89 | 189 |
90 | 10 | 90 | 190 |
91 | 1 | 91 | 191 |
92 | 2 | 92 | 192 |
93 | 3 | 93 | 193 |
94 | 4 | 94 | 194 |
95 | 5 | 95 | 195 |
96 | 6 | 96 | 196 |
97 | 7 | 97 | 197 |
98 | 8 | 98 | 198 |
99 | 9 | 99 | 199 |
100 | 10 | 100 | 200 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> minimum)
end
Row | id | x1_minimum | x2_minimum |
---|---|---|---|
Int64 | Int64 | Int64 | |
1 | 1 | 1 | 101 |
2 | 2 | 2 | 102 |
3 | 3 | 3 | 103 |
4 | 4 | 4 | 104 |
5 | 5 | 5 | 105 |
6 | 6 | 6 | 106 |
7 | 7 | 7 | 107 |
8 | 8 | 8 | 108 |
9 | 9 | 9 | 109 |
10 | 10 | 10 | 110 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> [minimum maximum])
end
Row | id | x1_minimum | x2_minimum | x1_maximum | x2_maximum |
---|---|---|---|---|---|
Int64 | Int64 | Int64 | Int64 | Int64 | |
1 | 1 | 1 | 101 | 91 | 191 |
2 | 2 | 2 | 102 | 92 | 192 |
3 | 3 | 3 | 103 | 93 | 193 |
4 | 4 | 4 | 104 | 94 | 194 |
5 | 5 | 5 | 105 | 95 | 195 |
6 | 6 | 6 | 106 | 96 | 196 |
7 | 7 | 7 | 107 | 97 | 197 |
8 | 8 | 8 | 108 | 98 | 198 |
9 | 9 | 9 | 109 | 99 | 199 |
10 | 10 | 10 | 110 | 100 | 200 |
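It can help to look at what the broadcasted expression produces before it reaches combine: a column vector of names broadcast against a row of functions gives a matrix of Pair objects. The sketch below uses a small made-up data frame; ops, renamed, lo1, and lo2 are illustrative names.

```julia
using DataFrames

df = DataFrame(id=[1, 1, 2, 2], x1=[1, 2, 3, 4], x2=[5, 6, 7, 8])

# 2-element vector .=> 1×2 row matrix gives a 2×2 matrix of Pairs
ops = [:x1, :x2] .=> [minimum maximum]
res = combine(groupby(df, :id), ops)

# broadcasting a third level assigns custom target names
renamed = combine(groupby(df, :id), [:x1, :x2] .=> minimum .=> [:lo1, :lo2])
```

The output columns follow the column-major order of the matrix, which is why both minimum columns come before both maximum columns.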
Aggregation of a data frame using mapcols#
x = DataFrame(rand(10, 10), :auto)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 0.239413 | 0.372632 | 0.391808 | 0.881766 | 0.926426 | 0.172987 | 0.886067 | 0.9445 | 0.99134 | 0.567602 |
2 | 0.127332 | 0.523684 | 0.280744 | 0.631459 | 0.100009 | 0.414213 | 0.422549 | 0.463721 | 0.0210759 | 0.569317 |
3 | 0.910273 | 0.749751 | 0.665598 | 0.661469 | 0.656725 | 0.294394 | 0.485097 | 0.411687 | 0.288302 | 0.633039 |
4 | 0.753325 | 0.642535 | 0.221339 | 0.333083 | 0.783431 | 0.274499 | 0.402339 | 0.698189 | 0.719778 | 0.198621 |
5 | 0.0387046 | 0.36916 | 0.717133 | 0.38962 | 0.658046 | 0.57561 | 0.0361161 | 0.18453 | 0.6539 | 0.490985 |
6 | 0.939608 | 0.232813 | 0.632598 | 0.625016 | 0.988554 | 0.364242 | 0.384142 | 0.837759 | 0.775318 | 0.136857 |
7 | 0.98604 | 0.646358 | 0.676885 | 0.661943 | 0.798212 | 0.60685 | 0.783194 | 0.72725 | 0.445532 | 0.311484 |
8 | 0.877028 | 0.452434 | 0.654184 | 0.592795 | 0.160648 | 0.882515 | 0.363015 | 0.852214 | 0.762508 | 0.863806 |
9 | 0.0306793 | 0.261351 | 0.292775 | 0.339821 | 0.143583 | 0.473889 | 0.229176 | 0.0522024 | 0.763665 | 0.277183 |
10 | 0.8066 | 0.818327 | 0.600803 | 0.634267 | 0.5311 | 0.610416 | 0.847606 | 0.703434 | 0.613901 | 0.974852 |
mapcols(mean, x)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 0.5709 | 0.506904 | 0.513387 | 0.575124 | 0.574673 | 0.466962 | 0.48393 | 0.587549 | 0.603532 | 0.502374 |
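mapcols is not limited to functions that reduce a column to a scalar; the function may also return a vector, in which case the result has one row per element. A small sketch with made-up data that centers every column:

```julia
using DataFrames
using Statistics

df = DataFrame(a=[1, 2, 3], b=[10, 20, 30])

# scalar result per column: a one-row data frame
means = mapcols(mean, df)

# vector result per column: same number of rows as the source
centered = mapcols(col -> col .- mean(col), df)
```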
Mapping rows and columns using eachcol and eachrow#
map a function over each column and return a vector
map(mean, eachcol(x))
10-element Vector{Float64}:
0.5709004496231695
0.5069044704575527
0.5133865376373459
0.5751238253561982
0.5746734925011687
0.46696151605594977
0.4839300017510227
0.5875487958217412
0.6035319143917144
0.5023744992952286
an iteration returns a Pair with column name and values
foreach(c -> println(c[1], ": ", mean(c[2])), pairs(eachcol(x)))
x1: 0.5709004496231695
x2: 0.5069044704575527
x3: 0.5133865376373459
x4: 0.5751238253561982
x5: 0.5746734925011687
x6: 0.46696151605594977
x7: 0.4839300017510227
x8: 0.5875487958217412
x9: 0.6035319143917144
x10: 0.5023744992952286
now the returned value is DataFrameRow which works as a NamedTuple but is a view to a parent DataFrame
map(r -> r.x1 / r.x2, eachrow(x))
10-element Vector{Float64}:
0.6424933971881048
0.24314771856095865
1.2141011230985737
1.1724263056867696
0.10484501684899052
4.035893016397789
1.5255319699420227
1.9384648268062645
0.11738731668127854
0.985669840177133
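Because a DataFrameRow is a view, assigning to one of its fields changes the parent data frame. The sketch below illustrates this on made-up data and also converts a row to a NamedTuple, which is an independent copy.

```julia
using DataFrames

df = DataFrame(a=[1, 2], b=[3, 4])

# copy the first row out before any mutation
snapshot = NamedTuple(first(eachrow(df)))

# writing through the row view updates df itself
for row in eachrow(df)
    row.b = row.a + row.b
end
```

After the loop df.b has changed, while snapshot still holds the original values.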
it prints like a data frame, only the caption is different so that you know the type of the object
er = eachrow(x)
er.x1 # you can access columns of a parent data frame directly
10-element Vector{Float64}:
0.23941342855307912
0.1273324638742388
0.9102731889046806
0.7533254463886194
0.038704570132973126
0.9396078861391909
0.9860398107614828
0.8770280375662959
0.03067929975280259
0.8066003641583329
it prints like a data frame, only the caption is different so that you know the type of the object
ec = eachcol(x)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
1 | 0.239413 | 0.372632 | 0.391808 | 0.881766 | 0.926426 | 0.172987 | 0.886067 | 0.9445 | 0.99134 | 0.567602 |
2 | 0.127332 | 0.523684 | 0.280744 | 0.631459 | 0.100009 | 0.414213 | 0.422549 | 0.463721 | 0.0210759 | 0.569317 |
3 | 0.910273 | 0.749751 | 0.665598 | 0.661469 | 0.656725 | 0.294394 | 0.485097 | 0.411687 | 0.288302 | 0.633039 |
4 | 0.753325 | 0.642535 | 0.221339 | 0.333083 | 0.783431 | 0.274499 | 0.402339 | 0.698189 | 0.719778 | 0.198621 |
5 | 0.0387046 | 0.36916 | 0.717133 | 0.38962 | 0.658046 | 0.57561 | 0.0361161 | 0.18453 | 0.6539 | 0.490985 |
6 | 0.939608 | 0.232813 | 0.632598 | 0.625016 | 0.988554 | 0.364242 | 0.384142 | 0.837759 | 0.775318 | 0.136857 |
7 | 0.98604 | 0.646358 | 0.676885 | 0.661943 | 0.798212 | 0.60685 | 0.783194 | 0.72725 | 0.445532 | 0.311484 |
8 | 0.877028 | 0.452434 | 0.654184 | 0.592795 | 0.160648 | 0.882515 | 0.363015 | 0.852214 | 0.762508 | 0.863806 |
9 | 0.0306793 | 0.261351 | 0.292775 | 0.339821 | 0.143583 | 0.473889 | 0.229176 | 0.0522024 | 0.763665 | 0.277183 |
10 | 0.8066 | 0.818327 | 0.600803 | 0.634267 | 0.5311 | 0.610416 | 0.847606 | 0.703434 | 0.613901 | 0.974852 |
you can access columns of a parent data frame directly
ec.x1
10-element Vector{Float64}:
0.23941342855307912
0.1273324638742388
0.9102731889046806
0.7533254463886194
0.038704570132973126
0.9396078861391909
0.9860398107614828
0.8770280375662959
0.03067929975280259
0.8066003641583329
Transposing#
you can transpose a data frame using permutedims:
df = DataFrame(reshape(1:12, 3, 4), :auto)
Row | x1 | x2 | x3 | x4 |
---|---|---|---|---|
Int64 | Int64 | Int64 | Int64 | |
1 | 1 | 4 | 7 | 10 |
2 | 2 | 5 | 8 | 11 |
3 | 3 | 6 | 9 | 12 |
df.names = ["a", "b", "c"]
3-element Vector{String}:
"a"
"b"
"c"
permutedims(df, :names)
Row | names | a | b | c |
---|---|---|---|---|
String | Int64 | Int64 | Int64 | |
1 | x1 | 1 | 2 | 3 |
2 | x2 | 4 | 5 | 6 |
3 | x3 | 7 | 8 | 9 |
4 | x4 | 10 | 11 | 12 |
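Applying permutedims twice with the same names column should take you back to the starting layout. A minimal sketch on a made-up data frame whose names column comes first:

```julia
using DataFrames

df = DataFrame(names=["a", "b", "c"], x1=[1, 2, 3], x2=[4, 5, 6])

# values of :names become column names; old column names fill the new :names column
transposed = permutedims(df, :names)

# transposing again restores the original columns
back = permutedims(transposed, :names)
```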
revert the change to the number of displayed rows
delete!(ENV, "LINES")
Base.EnvDict with 18 entries:
"PATH" => "/usr/local/julia//bin:/usr/local/bin:/usr/…
"HOSTNAME" => "c0e5116d4768"
"LANG" => "C.UTF-8"
"GPG_KEY" => "7169605F62C751356D054A26A821E680E5FA6305"
"PYTHON_VERSION" => "3.12.2"
"PYTHON_PIP_VERSION" => "24.0"
"PYTHON_GET_PIP_URL" => "https://github.com/pypa/get-pip/raw/dbf0c8…
"PYTHON_GET_PIP_SHA256" => "dfe9fd5c28dc98b5ac17979a953ea550cec37ae1b4…
"JULIA_CI" => "true"
"JULIA_NUM_THREADS" => "auto"
"JULIA_CONDAPKG_BACKEND" => "Null"
"JULIA_PATH" => "/usr/local/julia/"
"JULIA_DEPOT_PATH" => "/srv/juliapkg/"
"HOME" => "/root"
"JPY_PARENT_PID" => "1"
"OPENBLAS_MAIN_FREE" => "1"
"OPENBLAS_DEFAULT_NUM_THREADS" => "1"
"COLUMNS" => "80"