Transformation to DataFrames
Split-apply-combine
using DataFrames
Grouping a data frame
groupby
x = DataFrame(id=[1, 2, 3, 4, 1, 2, 3, 4], id2=[1, 2, 1, 2, 1, 2, 1, 2], v=rand(8))
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 2 | 2 | 0.0509911 |
3 | 3 | 1 | 0.572115 |
4 | 4 | 2 | 0.701202 |
5 | 1 | 1 | 0.892608 |
6 | 2 | 2 | 0.0499722 |
7 | 3 | 1 | 0.611035 |
8 | 4 | 2 | 0.503039 |
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 1 | 1 | 0.892608 |
⋮
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 4 | 2 | 0.701202 |
2 | 4 | 2 | 0.503039 |
groupby(x, [])
GroupedDataFrame with 1 group based on key:
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 2 | 2 | 0.0509911 |
3 | 3 | 1 | 0.572115 |
4 | 4 | 2 | 0.701202 |
5 | 1 | 1 | 0.892608 |
6 | 2 | 2 | 0.0499722 |
7 | 3 | 1 | 0.611035 |
8 | 4 | 2 | 0.503039 |
gx2 = groupby(x, [:id, :id2])
GroupedDataFrame with 4 groups based on keys: id, id2
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 1 | 1 | 0.892608 |
⋮
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 4 | 2 | 0.701202 |
2 | 4 | 2 | 0.503039 |
get the parent DataFrame
parent(gx2)
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 2 | 2 | 0.0509911 |
3 | 3 | 1 | 0.572115 |
4 | 4 | 2 | 0.701202 |
5 | 1 | 1 | 0.892608 |
6 | 2 | 2 | 0.0499722 |
7 | 3 | 1 | 0.611035 |
8 | 4 | 2 | 0.503039 |
back to the DataFrame, but in a different order of rows than the original
vcat(gx2...)
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 1 | 1 | 0.892608 |
3 | 2 | 2 | 0.0509911 |
4 | 2 | 2 | 0.0499722 |
5 | 3 | 1 | 0.572115 |
6 | 3 | 1 | 0.611035 |
7 | 4 | 2 | 0.701202 |
8 | 4 | 2 | 0.503039 |
the same as above
DataFrame(gx2)
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 1 | 1 | 0.892608 |
3 | 2 | 2 | 0.0509911 |
4 | 2 | 2 | 0.0499722 |
5 | 3 | 1 | 0.572115 |
6 | 3 | 1 | 0.611035 |
7 | 4 | 2 | 0.701202 |
8 | 4 | 2 | 0.503039 |
drop grouping columns when creating a data frame
DataFrame(gx2, keepkeys=false)
Row | v |
---|---|
| | Float64 |
1 | 0.0425907 |
2 | 0.892608 |
3 | 0.0509911 |
4 | 0.0499722 |
5 | 0.572115 |
6 | 0.611035 |
7 | 0.701202 |
8 | 0.503039 |
vector of names of grouping variables
groupcols(gx2)
2-element Vector{Symbol}:
:id
:id2
and non-grouping variables
valuecols(gx2)
1-element Vector{Symbol}:
:v
group indices in parent(gx2)
groupindices(gx2)
8-element Vector{Union{Missing, Int64}}:
1
2
3
4
1
2
3
4
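A small sketch of how the output of groupindices can be read, using a made-up data frame df (not the x from above): the i-th entry is the number of the group that row i of the parent belongs to, so it can be used to recover the rows of a given group. Passing sort=true is an assumption made here so that group 1 is known to be id == 1.

```julia
using DataFrames

# hypothetical toy data, chosen only for illustration
df = DataFrame(id=[1, 2, 1, 2], v=[10, 20, 30, 40])
gdf = groupby(df, :id, sort=true)   # sort=true so that group 1 is id == 1

idx = groupindices(gdf)             # one group number per row of parent(gdf)

# rows of the parent that belong to group 1
rows_in_group1 = findall(isequal(1), idx)

# indexing the parent with these rows gives the same values as gdf[1]
df.v[rows_in_group1] == gdf[1].v
```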
kgx2 = keys(gx2)
4-element DataFrames.GroupKeys{DataFrames.GroupedDataFrame{DataFrames.DataFrame}}:
GroupKey: (id = 1, id2 = 1)
GroupKey: (id = 2, id2 = 2)
GroupKey: (id = 3, id2 = 1)
GroupKey: (id = 4, id2 = 2)
You can index into a GroupedDataFrame like a vector or like a dictionary. The dictionary-style form accepts a GroupKey, a NamedTuple, or a Tuple.
gx2
GroupedDataFrame with 4 groups based on keys: id, id2
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 1 | 1 | 0.0425907 |
2 | 1 | 1 | 0.892608 |
⋮
Row | id | id2 | v |
---|---|---|---|
| | Int64 | Int64 | Float64 |
1 | 4 | 2 | 0.701202 |
2 | 4 | 2 | 0.503039 |
k = keys(gx2)[1]
GroupKey: (id = 1, id2 = 1)
ntk = NamedTuple(k)
(id = 1, id2 = 1)
tk = Tuple(k)
(1, 1)
the operations below produce the same result and are performant
gx2[1], gx2[k], gx2[ntk], gx2[tk]
(2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼─────────────────────────
1 │ 1 1 0.0425907
2 │ 1 1 0.892608, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼─────────────────────────
1 │ 1 1 0.0425907
2 │ 1 1 0.892608, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼─────────────────────────
1 │ 1 1 0.0425907
2 │ 1 1 0.892608, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼─────────────────────────
1 │ 1 1 0.0425907
2 │ 1 1 0.892608)
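Dictionary-style access also works with haskey, get, and pairs. The sketch below uses a made-up data frame df with integer values; it only illustrates the lookup pattern and is not part of the original notebook.

```julia
using DataFrames

# hypothetical toy data, chosen only for illustration
df = DataFrame(id=[1, 2, 1, 2], id2=[1, 2, 1, 2], v=[1, 2, 3, 4])
gdf = groupby(df, [:id, :id2])

# check whether a group exists, using a NamedTuple or a Tuple key
has_11 = haskey(gdf, (id=1, id2=1))
has_12 = haskey(gdf, (1, 2))        # this combination does not occur in df

# get returns the default when the key is absent instead of throwing
missing_group = get(gdf, (1, 2), nothing)

# pairs iterates over GroupKey => SubDataFrame pairs
sums = Dict(NamedTuple(k) => sum(sdf.v) for (k, sdf) in pairs(gdf))
```

Using get with a default is convenient when the set of keys is not known in advance, since plain indexing with an absent key raises an error.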
handling missing values
x = DataFrame(id=[missing, 5, 1, 3, missing], x=1:5)
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | missing | 1 |
2 | 5 | 2 |
3 | 1 | 3 |
4 | 3 | 4 |
5 | missing | 5 |
by default groups include missing values and their order is not guaranteed
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | 1 | 3 |
⋮
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | missing | 1 |
2 | missing | 5 |
but we can change it; now they are sorted
groupby(x, :id, sort=true, skipmissing=true)
GroupedDataFrame with 3 groups based on key: id
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | 1 | 3 |
⋮
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | 5 | 2 |
and now they are in the order they appear in the source data frame
groupby(x, :id, sort=false)
GroupedDataFrame with 4 groups based on key: id
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | missing | 1 |
2 | missing | 5 |
⋮
Row | id | x |
---|---|---|
| | Int64? | Int64 |
1 | 3 | 4 |
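Because the printed output above elides the middle groups, the effect of sort and skipmissing is easier to see by listing the group keys. The following is a minimal sketch on the same kind of data; the variable names are chosen here for illustration.

```julia
using DataFrames

df = DataFrame(id=[missing, 5, 1, 3, missing], x=1:5)

# sorted groups, with the missing group dropped
g_sorted = groupby(df, :id, sort=true, skipmissing=true)
ids_sorted = [k.id for k in keys(g_sorted)]

# groups in order of first appearance in the source, missing kept
g_source = groupby(df, :id, sort=false)
ids_source = [k.id for k in keys(g_source)]
```

Here ids_sorted holds 1, 3, 5, while ids_source starts with missing because that is the first value in the id column.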
Performing transformations
by group using combine, select, select!, transform, and transform!
using Statistics
using Chain
x = DataFrame(id=rand('a':'d', 100), v=rand(100))
Row | id | v |
---|---|---|
| | Char | Float64 |
1 | c | 0.976458 |
2 | c | 0.843507 |
3 | b | 0.0820769 |
4 | c | 0.854991 |
5 | a | 0.899427 |
6 | b | 0.396204 |
7 | a | 0.144418 |
8 | b | 0.605172 |
9 | b | 0.377043 |
10 | c | 0.756218 |
11 | b | 0.730404 |
12 | d | 0.968495 |
13 | a | 0.500889 |
⋮ | ⋮ | ⋮ |
89 | c | 0.807466 |
90 | a | 0.419191 |
91 | c | 0.661874 |
92 | d | 0.507822 |
93 | a | 0.6274 |
94 | c | 0.847145 |
95 | a | 0.237098 |
96 | b | 0.448951 |
97 | b | 0.0310993 |
98 | b | 0.137833 |
99 | d | 0.288475 |
100 | b | 0.0969202 |
apply a function to each group of a data frame; combine keeps as many rows as the function returns
@chain x begin
groupby(:id)
combine(:v => mean)
end
Row | id | v_mean |
---|---|---|
| | Char | Float64 |
1 | c | 0.687625 |
2 | b | 0.464012 |
3 | a | 0.494417 |
4 | d | 0.55287 |
x.id2 = axes(x, 1)
Base.OneTo(100)
select and transform keep as many rows as there are in the source data frame, in the original order. Additionally, transform keeps all columns from the source.
@chain x begin
groupby(:id)
transform(:v => mean)
end
Row | id | v | id2 | v_mean |
---|---|---|---|---|
| | Char | Float64 | Int64 | Float64 |
1 | c | 0.976458 | 1 | 0.687625 |
2 | c | 0.843507 | 2 | 0.687625 |
3 | b | 0.0820769 | 3 | 0.464012 |
4 | c | 0.854991 | 4 | 0.687625 |
5 | a | 0.899427 | 5 | 0.494417 |
6 | b | 0.396204 | 6 | 0.464012 |
7 | a | 0.144418 | 7 | 0.494417 |
8 | b | 0.605172 | 8 | 0.464012 |
9 | b | 0.377043 | 9 | 0.464012 |
10 | c | 0.756218 | 10 | 0.687625 |
11 | b | 0.730404 | 11 | 0.464012 |
12 | d | 0.968495 | 12 | 0.55287 |
13 | a | 0.500889 | 13 | 0.494417 |
⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
89 | c | 0.807466 | 89 | 0.687625 |
90 | a | 0.419191 | 90 | 0.494417 |
91 | c | 0.661874 | 91 | 0.687625 |
92 | d | 0.507822 | 92 | 0.55287 |
93 | a | 0.6274 | 93 | 0.494417 |
94 | c | 0.847145 | 94 | 0.687625 |
95 | a | 0.237098 | 95 | 0.494417 |
96 | b | 0.448951 | 96 | 0.464012 |
97 | b | 0.0310993 | 97 | 0.464012 |
98 | b | 0.137833 | 98 | 0.464012 |
99 | d | 0.288475 | 99 | 0.55287 |
100 | b | 0.0969202 | 100 | 0.464012 |
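To compare the three functions side by side, here is a minimal sketch on a made-up four-row data frame (the data and the names s, t, c are illustrative only). It shows which rows and columns each function keeps when given the same operation.

```julia
using DataFrames, Statistics

# hypothetical toy data, chosen only for illustration
df = DataFrame(id=['b', 'a', 'b', 'a'], v=[1.0, 2.0, 3.0, 4.0])
gdf = groupby(df, :id)

s = select(gdf, :v => mean)     # grouping column and result, one row per source row
t = transform(gdf, :v => mean)  # all source columns and result, one row per source row
c = combine(gdf, :v => mean)    # one row per group, since mean returns a scalar

(names(s), names(t), nrow(c))
```

With select and transform the group mean is repeated for every row of the group, and the rows stay in the order of df.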
note that combine reorders rows so that they follow the order of groups in the GroupedDataFrame
@chain x begin
groupby(:id)
combine(:id2, :v => mean)
end
Row | id | id2 | v_mean |
---|---|---|---|
| | Char | Int64 | Float64 |
1 | c | 1 | 0.687625 |
2 | c | 2 | 0.687625 |
3 | c | 4 | 0.687625 |
4 | c | 10 | 0.687625 |
5 | c | 16 | 0.687625 |
6 | c | 21 | 0.687625 |
7 | c | 22 | 0.687625 |
8 | c | 26 | 0.687625 |
9 | c | 28 | 0.687625 |
10 | c | 31 | 0.687625 |
11 | c | 36 | 0.687625 |
12 | c | 38 | 0.687625 |
13 | c | 44 | 0.687625 |
⋮ | ⋮ | ⋮ | ⋮ |
89 | d | 37 | 0.55287 |
90 | d | 39 | 0.55287 |
91 | d | 45 | 0.55287 |
92 | d | 50 | 0.55287 |
93 | d | 63 | 0.55287 |
94 | d | 65 | 0.55287 |
95 | d | 71 | 0.55287 |
96 | d | 74 | 0.55287 |
97 | d | 78 | 0.55287 |
98 | d | 88 | 0.55287 |
99 | d | 92 | 0.55287 |
100 | d | 99 | 0.55287 |
we give a custom name for the result column
@chain x begin
groupby(:id)
combine(:v => mean => :res)
end
Row | id | res |
---|---|---|
| | Char | Float64 |
1 | c | 0.687625 |
2 | b | 0.464012 |
3 | a | 0.494417 |
4 | d | 0.55287 |
you can have multiple operations
@chain x begin
groupby(:id)
combine(:v => mean => :res1, :v => sum => :res2, nrow => :n)
end
Row | id | res1 | res2 | n |
---|---|---|---|---|
| | Char | Float64 | Float64 | Int64 |
1 | c | 0.687625 | 20.6287 | 30 |
2 | b | 0.464012 | 12.0643 | 26 |
3 | a | 0.494417 | 14.3381 | 29 |
4 | d | 0.55287 | 8.29305 | 15 |
Additional notes:
- select! and transform! perform operations in-place
- the general syntax for transformation is source_columns => function => target_column
- if you pass multiple columns to a function they are treated as positional arguments
- ByRow and AsTable work exactly as discussed for operations on data frames in 05_columns.ipynb
- you can automatically group the result of combine, select, etc. again by passing the ungroup=false keyword argument to them
- similarly, the keepkeys keyword argument allows you to drop grouping columns from the resulting data frame
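The notes above can be sketched on a small made-up data frame; the column names a, b and the result names r1 to r4 are assumptions introduced only for this illustration.

```julia
using DataFrames, Statistics

# hypothetical toy data, chosen only for illustration
df = DataFrame(id=[1, 1, 2, 2], a=[1.0, 2.0, 3.0, 4.0], b=[2.0, 4.0, 6.0, 8.0])
gdf = groupby(df, :id)

# two source columns arrive as two positional arguments of the function
r1 = combine(gdf, [:a, :b] => ((a, b) -> sum(a .* b)) => :dot)

# ByRow applies the function to each row instead of to whole columns
r2 = transform(gdf, [:a, :b] => ByRow(+) => :a_plus_b)

# keepkeys=false drops the grouping column from the result
r3 = combine(gdf, :a => mean => :a_mean, keepkeys=false)

# ungroup=false returns a GroupedDataFrame instead of a DataFrame
r4 = combine(gdf, :a => mean => :a_mean, ungroup=false)

# transform! adds the new column to the parent data frame in place
transform!(gdf, :a => mean => :a_group_mean)
```

After the last call, df itself has the extra column a_group_mean, while r1 to r4 are separate objects.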
You can also pass a function to all of these functions (as a special case, as the first argument). The function receives a SubDataFrame for each group, and its return value can be a table. In particular, returning an empty table from the function is an easy way to drop a group. Because the function can be the first argument, you can use the do block syntax.
Here is an example:
combine(groupby(x, :id)) do sdf
n = nrow(sdf)
n < 25 ? DataFrame() : DataFrame(n=n) ## drop groups with low number of rows
end
Row | id | n |
---|---|---|
| | Char | Int64 |
1 | c | 30 |
2 | b | 26 |
3 | a | 29 |
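A do block can also return a table with several columns. The sketch below uses made-up data and column names (vmin, vmax are illustrative); both branches return a DataFrame so that the kind of return value stays the same across groups.

```julia
using DataFrames

# hypothetical toy data, chosen only for illustration
df = DataFrame(id=[1, 1, 2, 2, 3], v=[10, 20, 30, 40, 50])

res = combine(groupby(df, :id)) do sdf
    # groups with fewer than two rows are dropped by returning an empty table
    nrow(sdf) < 2 && return DataFrame()
    # otherwise produce one row with two summary columns
    DataFrame(vmin=minimum(sdf.v), vmax=maximum(sdf.v))
end
```

The group with id equal to 3 has a single row, so it does not appear in res.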
You can also produce multiple columns in a single operation:
df = DataFrame(id=[1, 1, 2, 2], val=[1, 2, 3, 4])
Row | id | val |
---|---|---|
| | Int64 | Int64 |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 2 | 3 |
4 | 2 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => AsTable)
end
Row | id | x1 | x2 |
---|---|---|---|
| | Int64 | Int64 | Int64 |
1 | 1 | 1 | 2 |
2 | 2 | 3 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => [:c1, :c2])
end
Row | id | c1 | c2 |
---|---|---|---|
| | Int64 | Int64 | Int64 |
1 | 1 | 1 | 2 |
2 | 2 | 3 | 4 |
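Another common way to produce several columns is to return a NamedTuple and use AsTable as the target, so that the field names become column names. This is a minimal sketch; the names lo and hi are chosen here for illustration.

```julia
using DataFrames

df = DataFrame(id=[1, 1, 2, 2], val=[1, 2, 3, 4])

# the function returns a NamedTuple; AsTable expands its fields into columns
res = combine(groupby(df, :id),
              :val => (x -> (lo=minimum(x), hi=maximum(x))) => AsTable)
```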
It is easy to unnest a column into multiple columns:
df = DataFrame(a=[(p=1, q=2), (p=3, q=4)])
select(df, :a => AsTable)
Row | p | q |
---|---|---|
| | Int64 | Int64 |
1 | 1 | 2 |
2 | 3 | 4 |
automatic column names generated
df = DataFrame(a=[[1, 2], [3, 4]])
select(df, :a => AsTable)
Row | x1 | x2 |
---|---|---|
| | Int64 | Int64 |
1 | 1 | 2 |
2 | 3 | 4 |
custom column names generated
select(df, :a => [:C1, :C2])
Row | C1 | C2 |
---|---|---|
| | Int64 | Int64 |
1 | 1 | 2 |
2 | 3 | 4 |
Finally, observe that one can conveniently apply multiple transformations using broadcasting:
df = DataFrame(id=repeat(1:10, 10), x1=1:100, x2=101:200)
Row | id | x1 | x2 |
---|---|---|---|
| | Int64 | Int64 | Int64 |
1 | 1 | 1 | 101 |
2 | 2 | 2 | 102 |
3 | 3 | 3 | 103 |
4 | 4 | 4 | 104 |
5 | 5 | 5 | 105 |
6 | 6 | 6 | 106 |
7 | 7 | 7 | 107 |
8 | 8 | 8 | 108 |
9 | 9 | 9 | 109 |
10 | 10 | 10 | 110 |
11 | 1 | 11 | 111 |
12 | 2 | 12 | 112 |
13 | 3 | 13 | 113 |
⋮ | ⋮ | ⋮ | ⋮ |
89 | 9 | 89 | 189 |
90 | 10 | 90 | 190 |
91 | 1 | 91 | 191 |
92 | 2 | 92 | 192 |
93 | 3 | 93 | 193 |
94 | 4 | 94 | 194 |
95 | 5 | 95 | 195 |
96 | 6 | 96 | 196 |
97 | 7 | 97 | 197 |
98 | 8 | 98 | 198 |
99 | 9 | 99 | 199 |
100 | 10 | 100 | 200 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> minimum)
end
Row | id | x1_minimum | x2_minimum |
---|---|---|---|
| | Int64 | Int64 | Int64 |
1 | 1 | 1 | 101 |
2 | 2 | 2 | 102 |
3 | 3 | 3 | 103 |
4 | 4 | 4 | 104 |
5 | 5 | 5 | 105 |
6 | 6 | 6 | 106 |
7 | 7 | 7 | 107 |
8 | 8 | 8 | 108 |
9 | 9 | 9 | 109 |
10 | 10 | 10 | 110 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> [minimum maximum])
end
Row | id | x1_minimum | x2_minimum | x1_maximum | x2_maximum |
---|---|---|---|---|---|
| | Int64 | Int64 | Int64 | Int64 | Int64 |
1 | 1 | 1 | 101 | 91 | 191 |
2 | 2 | 2 | 102 | 92 | 192 |
3 | 3 | 3 | 103 | 93 | 193 |
4 | 4 | 4 | 104 | 94 | 194 |
5 | 5 | 5 | 105 | 95 | 195 |
6 | 6 | 6 | 106 | 96 | 196 |
7 | 7 | 7 | 107 | 97 | 197 |
8 | 8 | 8 | 108 | 98 | 198 |
9 | 9 | 9 | 109 | 99 | 199 |
10 | 10 | 10 | 110 | 100 | 200 |
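To see why the columns come out in this order, it helps to inspect what the broadcast produces before it is passed to combine. The sketch below uses a smaller made-up data frame; broadcasting a column vector against a one-row matrix gives a 2×2 matrix of pairs, which is traversed column by column.

```julia
using DataFrames

# a vector of columns broadcast against a row of functions gives a matrix of pairs
ops = [:x1, :x2] .=> [minimum maximum]

# column-major order: x1 => minimum, x2 => minimum, x1 => maximum, x2 => maximum
flat = vec(ops)

# hypothetical smaller data, chosen only for illustration
df = DataFrame(id=repeat(1:2, 3), x1=1:6, x2=101:106)
res = combine(groupby(df, :id), ops)
```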
Aggregation of a data frame using mapcols
x = DataFrame(rand(10, 10), :auto)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
| | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | 0.268108 | 0.964137 | 0.921643 | 0.390505 | 0.668213 | 0.564771 | 0.289142 | 0.346456 | 0.836103 | 0.791965 |
2 | 0.490096 | 0.348336 | 0.829621 | 0.979889 | 0.341851 | 0.911495 | 0.31599 | 0.4777 | 0.0737997 | 0.786805 |
3 | 0.441623 | 0.337702 | 0.0479944 | 0.61183 | 0.181313 | 0.56065 | 0.898956 | 0.378213 | 0.844997 | 0.48045 |
4 | 0.98313 | 0.609842 | 0.268869 | 0.771299 | 0.817549 | 0.245241 | 0.439861 | 0.455685 | 0.370928 | 0.749379 |
5 | 0.233696 | 0.235316 | 0.209366 | 0.151025 | 0.486947 | 0.0269458 | 0.0409115 | 0.27374 | 0.0119822 | 0.763845 |
6 | 0.72524 | 0.0993523 | 0.378052 | 0.714694 | 0.89491 | 0.101437 | 0.0767853 | 0.886487 | 0.805825 | 0.58954 |
7 | 0.0513031 | 0.780559 | 0.625197 | 0.309608 | 0.0860559 | 0.0954276 | 0.127451 | 0.86544 | 0.452562 | 0.0974757 |
8 | 0.340124 | 0.77576 | 0.0194376 | 0.985434 | 0.0375154 | 0.801026 | 0.528543 | 0.333735 | 0.0716595 | 0.272976 |
9 | 0.673034 | 0.830461 | 0.902682 | 0.97038 | 0.432874 | 0.782695 | 0.400102 | 0.531082 | 0.670479 | 0.859868 |
10 | 0.932144 | 0.451464 | 0.460209 | 0.164451 | 0.539835 | 0.764005 | 0.819339 | 0.648105 | 0.745979 | 0.879255 |
mapcols(mean, x)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
| | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | 0.51385 | 0.543293 | 0.466307 | 0.604911 | 0.448706 | 0.485369 | 0.393708 | 0.519664 | 0.488431 | 0.627156 |
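mapcols is not limited to functions that return a scalar. As a small sketch on made-up data, a function that returns a vector of the same length transforms every column while keeping the number of rows:

```julia
using DataFrames, Statistics

# hypothetical toy data, chosen only for illustration
df = DataFrame(a=[1.0, 2.0, 3.0], b=[10.0, 20.0, 30.0])

means = mapcols(mean, df)                        # one row, scalar per column
centered = mapcols(col -> col .- mean(col), df)  # same shape as df, each column centered
```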
Mapping rows and columns using eachcol and eachrow
map a function over each column and return a vector
map(mean, eachcol(x))
10-element Vector{Float64}:
0.5138499053530677
0.5432929873490393
0.4663071840543761
0.6049114902111529
0.4487063307522809
0.48536940553131674
0.39370812175433223
0.519664191175476
0.4884312090863836
0.6271559063716318
iterating over pairs(eachcol(x)) yields a Pair of the column name and its values
foreach(c -> println(c[1], ": ", mean(c[2])), pairs(eachcol(x)))
x1: 0.5138499053530677
x2: 0.5432929873490393
x3: 0.4663071840543761
x4: 0.6049114902111529
x5: 0.4487063307522809
x6: 0.48536940553131674
x7: 0.39370812175433223
x8: 0.519664191175476
x9: 0.4884312090863836
x10: 0.6271559063716318
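The same iteration can collect results instead of printing them. This is a minimal sketch on made-up data that builds a dictionary from column name to column mean; the name col_means is chosen here for illustration.

```julia
using DataFrames, Statistics

# hypothetical toy data, chosen only for illustration
df = DataFrame(a=[1.0, 2.0, 3.0], b=[10.0, 20.0, 30.0])

# each element of pairs(eachcol(df)) is a column name (a Symbol) paired with the column
col_means = Dict(name => mean(col) for (name, col) in pairs(eachcol(df)))
```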
now each element passed to the function is a DataFrameRow, which works like a NamedTuple but is a view into the parent DataFrame
map(r -> r.x1 / r.x2, eachrow(x))
10-element Vector{Float64}:
0.2780809863660134
1.4069622297968167
1.3077289482952075
1.6121065509237753
0.9931156388639917
7.299683165942576
0.06572613008547387
0.4384391019088276
0.8104348130976737
2.0647140078056307
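Because a DataFrameRow is a view, writing to it changes the parent data frame, while converting it to a NamedTuple takes an independent snapshot. A small sketch on made-up data:

```julia
using DataFrames

# hypothetical toy data, chosen only for illustration
df = DataFrame(a=[1, 2, 3], b=[10, 20, 30])

# assigning through the row view writes into df itself
for r in eachrow(df)
    r.b = r.a + r.b
end

# a NamedTuple made from a row is a copy and does not follow later changes
snapshot = NamedTuple(df[1, :])
df[1, :a] = 100
```

After this code runs, df.a starts with 100 while snapshot.a is still 1.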
it prints like a data frame, only the caption is different so that you know the type of the object
er = eachrow(x)
er.x1 ## you can access columns of a parent data frame directly
10-element Vector{Float64}:
0.2681081593548079
0.4900958196613917
0.44162319542270223
0.9831302949527768
0.23369612446354215
0.7252401254413869
0.05130311782432795
0.3401235342198399
0.6730343878634002
0.9321442943265011
it prints like a data frame, only the caption is different so that you know the type of the object
ec = eachcol(x)
Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
---|---|---|---|---|---|---|---|---|---|---|
| | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
1 | 0.268108 | 0.964137 | 0.921643 | 0.390505 | 0.668213 | 0.564771 | 0.289142 | 0.346456 | 0.836103 | 0.791965 |
2 | 0.490096 | 0.348336 | 0.829621 | 0.979889 | 0.341851 | 0.911495 | 0.31599 | 0.4777 | 0.0737997 | 0.786805 |
3 | 0.441623 | 0.337702 | 0.0479944 | 0.61183 | 0.181313 | 0.56065 | 0.898956 | 0.378213 | 0.844997 | 0.48045 |
4 | 0.98313 | 0.609842 | 0.268869 | 0.771299 | 0.817549 | 0.245241 | 0.439861 | 0.455685 | 0.370928 | 0.749379 |
5 | 0.233696 | 0.235316 | 0.209366 | 0.151025 | 0.486947 | 0.0269458 | 0.0409115 | 0.27374 | 0.0119822 | 0.763845 |
6 | 0.72524 | 0.0993523 | 0.378052 | 0.714694 | 0.89491 | 0.101437 | 0.0767853 | 0.886487 | 0.805825 | 0.58954 |
7 | 0.0513031 | 0.780559 | 0.625197 | 0.309608 | 0.0860559 | 0.0954276 | 0.127451 | 0.86544 | 0.452562 | 0.0974757 |
8 | 0.340124 | 0.77576 | 0.0194376 | 0.985434 | 0.0375154 | 0.801026 | 0.528543 | 0.333735 | 0.0716595 | 0.272976 |
9 | 0.673034 | 0.830461 | 0.902682 | 0.97038 | 0.432874 | 0.782695 | 0.400102 | 0.531082 | 0.670479 | 0.859868 |
10 | 0.932144 | 0.451464 | 0.460209 | 0.164451 | 0.539835 | 0.764005 | 0.819339 | 0.648105 | 0.745979 | 0.879255 |
you can access columns of a parent data frame directly
ec.x1
10-element Vector{Float64}:
0.2681081593548079
0.4900958196613917
0.44162319542270223
0.9831302949527768
0.23369612446354215
0.7252401254413869
0.05130311782432795
0.3401235342198399
0.6730343878634002
0.9321442943265011
Transposing
you can transpose a data frame using permutedims:
df = DataFrame(reshape(1:12, 3, 4), :auto)
Row | x1 | x2 | x3 | x4 |
---|---|---|---|---|
| | Int64 | Int64 | Int64 | Int64 |
1 | 1 | 4 | 7 | 10 |
2 | 2 | 5 | 8 | 11 |
3 | 3 | 6 | 9 | 12 |
df.names = ["a", "b", "c"]
3-element Vector{String}:
"a"
"b"
"c"
permutedims(df, :names)
Row | names | a | b | c |
---|---|---|---|---|
| | String | Int64 | Int64 | Int64 |
1 | x1 | 1 | 2 | 3 |
2 | x2 | 4 | 5 | 6 |
3 | x3 | 7 | 8 | 9 |
4 | x4 | 10 | 11 | 12 |
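As a quick check of what the second argument does, the sketch below transposes the same data twice. The column passed to permutedims supplies the new column names, and the old column names are stored in a column with that same name, so applying the call again recovers the original values (with the names column moved to the front). The variable names t and back are chosen here for illustration.

```julia
using DataFrames

df = DataFrame(reshape(1:12, 3, 4), :auto)
df.names = ["a", "b", "c"]

t = permutedims(df, :names)     # values of :names become column names
back = permutedims(t, :names)   # transposing again restores x1 to x4

(names(t), names(back))
```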
This notebook was generated using Literate.jl.