Transformation to DataFrames
Split-apply-combine
using DataFrames
Grouping a data frame
groupby
x = DataFrame(id=[1, 2, 3, 4, 1, 2, 3, 4], id2=[1, 2, 1, 2, 1, 2, 1, 2], v=rand(8))
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 2 | 2 | 0.00519032 |
| 3 | 3 | 1 | 0.0942121 |
| 4 | 4 | 2 | 0.458129 |
| 5 | 1 | 1 | 0.26202 |
| 6 | 2 | 2 | 0.849367 |
| 7 | 3 | 1 | 0.522893 |
| 8 | 4 | 2 | 0.24494 |
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 1 | 1 | 0.26202 |
⋮
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 4 | 2 | 0.458129 |
| 2 | 4 | 2 | 0.24494 |
groupby(x, [])
GroupedDataFrame with 1 group based on key:
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 2 | 2 | 0.00519032 |
| 3 | 3 | 1 | 0.0942121 |
| 4 | 4 | 2 | 0.458129 |
| 5 | 1 | 1 | 0.26202 |
| 6 | 2 | 2 | 0.849367 |
| 7 | 3 | 1 | 0.522893 |
| 8 | 4 | 2 | 0.24494 |
gx2 = groupby(x, [:id, :id2])
GroupedDataFrame with 4 groups based on keys: id, id2
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 1 | 1 | 0.26202 |
⋮
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 4 | 2 | 0.458129 |
| 2 | 4 | 2 | 0.24494 |
get the parent DataFrame
parent(gx2)
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 2 | 2 | 0.00519032 |
| 3 | 3 | 1 | 0.0942121 |
| 4 | 4 | 2 | 0.458129 |
| 5 | 1 | 1 | 0.26202 |
| 6 | 2 | 2 | 0.849367 |
| 7 | 3 | 1 | 0.522893 |
| 8 | 4 | 2 | 0.24494 |
convert back to a DataFrame; the rows are now ordered by group, not as in the original
vcat(gx2...)
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 1 | 1 | 0.26202 |
| 3 | 2 | 2 | 0.00519032 |
| 4 | 2 | 2 | 0.849367 |
| 5 | 3 | 1 | 0.0942121 |
| 6 | 3 | 1 | 0.522893 |
| 7 | 4 | 2 | 0.458129 |
| 8 | 4 | 2 | 0.24494 |
the same as above
DataFrame(gx2)
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 1 | 1 | 0.26202 |
| 3 | 2 | 2 | 0.00519032 |
| 4 | 2 | 2 | 0.849367 |
| 5 | 3 | 1 | 0.0942121 |
| 6 | 3 | 1 | 0.522893 |
| 7 | 4 | 2 | 0.458129 |
| 8 | 4 | 2 | 0.24494 |
drop grouping columns when creating a data frame
DataFrame(gx2, keepkeys=false)
| Row | v |
|---|---|
| Float64 | |
| 1 | 0.660288 |
| 2 | 0.26202 |
| 3 | 0.00519032 |
| 4 | 0.849367 |
| 5 | 0.0942121 |
| 6 | 0.522893 |
| 7 | 0.458129 |
| 8 | 0.24494 |
vector of names of grouping variables
groupcols(gx2)
2-element Vector{Symbol}:
:id
:id2
and non-grouping variables
valuecols(gx2)
1-element Vector{Symbol}:
:v
group index of each row of parent(gx2)
groupindices(gx2)
8-element Vector{Union{Missing, Int64}}:
1
2
3
4
1
2
3
4
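The element type above is Union{Missing, Int64} because a row that belongs to no group has no index. A minimal sketch of that case, assuming a hypothetical toy data frame and groups built with skipmissing=true:

```julia
using DataFrames

# Hypothetical toy data: the key of the second row is missing
df = DataFrame(id=[1, missing, 2, 1], v=1:4)

# sort=true fixes the group order; skipmissing=true leaves the
# row with a missing key out of every group
gd = groupby(df, :id, sort=true, skipmissing=true)

# One entry per row of parent(gd); the dropped row is marked `missing`
idx = groupindices(gd)
```

Here idx has one entry for each row of df, and the entry for the second row is missing.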
kgx2 = keys(gx2)
4-element DataFrames.GroupKeys{DataFrames.GroupedDataFrame{DataFrames.DataFrame}}:
GroupKey: (id = 1, id2 = 1)
GroupKey: (id = 2, id2 = 2)
GroupKey: (id = 3, id2 = 1)
GroupKey: (id = 4, id2 = 2)
You can index into a GroupedDataFrame as you would into a vector or into a dictionary. The dictionary-style form accepts a GroupKey, a NamedTuple, or a Tuple.
gx2
GroupedDataFrame with 4 groups based on keys: id, id2
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 1 | 1 | 0.660288 |
| 2 | 1 | 1 | 0.26202 |
⋮
| Row | id | id2 | v |
|---|---|---|---|
| Int64 | Int64 | Float64 | |
| 1 | 4 | 2 | 0.458129 |
| 2 | 4 | 2 | 0.24494 |
k = keys(gx2)[1]
GroupKey: (id = 1, id2 = 1)
ntk = NamedTuple(k)
(id = 1, id2 = 1)
tk = Tuple(k)
(1, 1)
the operations below produce the same result and are performant
gx2[1], gx2[k], gx2[ntk], gx2[tk]
(2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼────────────────────────
1 │ 1 1 0.660288
2 │ 1 1 0.26202, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼────────────────────────
1 │ 1 1 0.660288
2 │ 1 1 0.26202, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼────────────────────────
1 │ 1 1 0.660288
2 │ 1 1 0.26202, 2×3 SubDataFrame
Row │ id id2 v
│ Int64 Int64 Float64
─────┼────────────────────────
1 │ 1 1 0.660288
2 │ 1 1 0.26202)
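Dictionary-style access also suggests dictionary-style lookups. The following is a small sketch on a hypothetical data frame, using haskey, get, and pairs to test for a group, fetch one with a fallback, and iterate over key and group together:

```julia
using DataFrames

# Hypothetical data with the same key columns as above
df = DataFrame(id=[1, 2, 1, 2], id2=[1, 2, 1, 2], v=[0.1, 0.2, 0.3, 0.4])
gd = groupby(df, [:id, :id2])

# Test whether a group exists, using a Tuple or a NamedTuple as the key
has_11 = haskey(gd, (1, 1))
has_31 = haskey(gd, (id=3, id2=1))

# Fetch a group, falling back to a default when the key is absent
absent = get(gd, (3, 1), nothing)

# Iterate over GroupKey => SubDataFrame pairs
for (key, sdf) in pairs(gd)
    println(NamedTuple(key), " => ", nrow(sdf), " rows")
end
```

Using get with a default avoids the KeyError that plain indexing raises for a key that does not exist.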
handling missing values
x = DataFrame(id=[missing, 5, 1, 3, missing], x=1:5)
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | missing | 1 |
| 2 | 5 | 2 |
| 3 | 1 | 3 |
| 4 | 3 | 4 |
| 5 | missing | 5 |
by default groups include missing values and their order is not guaranteed
groupby(x, :id)
GroupedDataFrame with 4 groups based on key: id
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | 1 | 3 |
⋮
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | missing | 1 |
| 2 | missing | 5 |
but we can change this; now the groups are sorted and missing values are skipped
groupby(x, :id, sort=true, skipmissing=true)
GroupedDataFrame with 3 groups based on key: id
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | 1 | 3 |
⋮
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | 5 | 2 |
and now they are in the order they appear in the source data frame
groupby(x, :id, sort=false)
GroupedDataFrame with 4 groups based on key: id
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | missing | 1 |
| 2 | missing | 5 |
⋮
| Row | id | x |
|---|---|---|
| Int64? | Int64 | |
| 1 | 3 | 4 |
Performing transformations
by group using combine, select, select!, transform, and transform!
using Statistics
using Chain
x = DataFrame(id=rand('a':'d', 100), v=rand(100))
| Row | id | v |
|---|---|---|
| Char | Float64 | |
| 1 | d | 0.248966 |
| 2 | b | 0.677136 |
| 3 | d | 0.0390287 |
| 4 | d | 0.338539 |
| 5 | a | 0.914206 |
| 6 | c | 0.119065 |
| 7 | b | 0.572222 |
| 8 | d | 0.403218 |
| 9 | c | 0.907131 |
| 10 | d | 0.419591 |
| 11 | a | 0.438969 |
| 12 | b | 0.0641178 |
| 13 | b | 0.983015 |
| ⋮ | ⋮ | ⋮ |
| 89 | a | 0.0495181 |
| 90 | d | 0.106207 |
| 91 | a | 0.00476801 |
| 92 | c | 0.985394 |
| 93 | d | 0.122091 |
| 94 | d | 0.867029 |
| 95 | a | 0.616144 |
| 96 | d | 0.99123 |
| 97 | b | 0.13803 |
| 98 | b | 0.390887 |
| 99 | b | 0.372689 |
| 100 | b | 0.927418 |
apply a function to each group of a data frame; combine keeps as many rows as are returned from the function
@chain x begin
groupby(:id)
combine(:v => mean)
end
| Row | id | v_mean |
|---|---|---|
| Char | Float64 | |
| 1 | d | 0.461606 |
| 2 | b | 0.498758 |
| 3 | a | 0.529388 |
| 4 | c | 0.546276 |
x.id2 = axes(x, 1)
Base.OneTo(100)
select and transform keep as many rows as there are in the source data frame, in their original order. Additionally, transform keeps all columns from the source.
@chain x begin
groupby(:id)
transform(:v => mean)
end
| Row | id | v | id2 | v_mean |
|---|---|---|---|---|
| Char | Float64 | Int64 | Float64 | |
| 1 | d | 0.248966 | 1 | 0.461606 |
| 2 | b | 0.677136 | 2 | 0.498758 |
| 3 | d | 0.0390287 | 3 | 0.461606 |
| 4 | d | 0.338539 | 4 | 0.461606 |
| 5 | a | 0.914206 | 5 | 0.529388 |
| 6 | c | 0.119065 | 6 | 0.546276 |
| 7 | b | 0.572222 | 7 | 0.498758 |
| 8 | d | 0.403218 | 8 | 0.461606 |
| 9 | c | 0.907131 | 9 | 0.546276 |
| 10 | d | 0.419591 | 10 | 0.461606 |
| 11 | a | 0.438969 | 11 | 0.529388 |
| 12 | b | 0.0641178 | 12 | 0.498758 |
| 13 | b | 0.983015 | 13 | 0.498758 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 89 | a | 0.0495181 | 89 | 0.529388 |
| 90 | d | 0.106207 | 90 | 0.461606 |
| 91 | a | 0.00476801 | 91 | 0.529388 |
| 92 | c | 0.985394 | 92 | 0.546276 |
| 93 | d | 0.122091 | 93 | 0.461606 |
| 94 | d | 0.867029 | 94 | 0.461606 |
| 95 | a | 0.616144 | 95 | 0.529388 |
| 96 | d | 0.99123 | 96 | 0.461606 |
| 97 | b | 0.13803 | 97 | 0.498758 |
| 98 | b | 0.390887 | 98 | 0.498758 |
| 99 | b | 0.372689 | 99 | 0.498758 |
| 100 | b | 0.927418 | 100 | 0.498758 |
note that combine orders the resulting rows by group of the GroupedDataFrame
@chain x begin
groupby(:id)
combine(:id2, :v => mean)
end
| Row | id | id2 | v_mean |
|---|---|---|---|
| Char | Int64 | Float64 | |
| 1 | d | 1 | 0.461606 |
| 2 | d | 3 | 0.461606 |
| 3 | d | 4 | 0.461606 |
| 4 | d | 8 | 0.461606 |
| 5 | d | 10 | 0.461606 |
| 6 | d | 16 | 0.461606 |
| 7 | d | 22 | 0.461606 |
| 8 | d | 24 | 0.461606 |
| 9 | d | 32 | 0.461606 |
| 10 | d | 34 | 0.461606 |
| 11 | d | 37 | 0.461606 |
| 12 | d | 38 | 0.461606 |
| 13 | d | 40 | 0.461606 |
| ⋮ | ⋮ | ⋮ | ⋮ |
| 89 | c | 47 | 0.546276 |
| 90 | c | 48 | 0.546276 |
| 91 | c | 53 | 0.546276 |
| 92 | c | 54 | 0.546276 |
| 93 | c | 65 | 0.546276 |
| 94 | c | 71 | 0.546276 |
| 95 | c | 73 | 0.546276 |
| 96 | c | 74 | 0.546276 |
| 97 | c | 78 | 0.546276 |
| 98 | c | 86 | 0.546276 |
| 99 | c | 87 | 0.546276 |
| 100 | c | 92 | 0.546276 |
we give a custom name for the result column
@chain x begin
groupby(:id)
combine(:v => mean => :res)
end
| Row | id | res |
|---|---|---|
| Char | Float64 | |
| 1 | d | 0.461606 |
| 2 | b | 0.498758 |
| 3 | a | 0.529388 |
| 4 | c | 0.546276 |
you can have multiple operations
@chain x begin
groupby(:id)
combine(:v => mean => :res1, :v => sum => :res2, nrow => :n)
end
| Row | id | res1 | res2 | n |
|---|---|---|---|---|
| Char | Float64 | Float64 | Int64 | |
| 1 | d | 0.461606 | 13.3866 | 29 |
| 2 | b | 0.498758 | 12.4689 | 25 |
| 3 | a | 0.529388 | 10.5878 | 20 |
| 4 | c | 0.546276 | 14.2032 | 26 |
Additional notes:
- select! and transform! perform operations in-place
- the general syntax for transformation is source_columns => function => target_column
- if you pass multiple columns to a function they are treated as positional arguments
- ByRow and AsTable work exactly as discussed for operations on data frames in 05_columns.ipynb
- you can automatically group the result of combine, select etc. again by passing the ungroup=false keyword argument to them
- similarly, the keepkeys keyword argument allows you to drop grouping columns from the resulting data frame
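The notes above can be illustrated with a short sketch. The data frame and the column names g, a, b are hypothetical, chosen only to keep the arithmetic easy to follow:

```julia
using DataFrames

# Hypothetical toy data
df = DataFrame(g=[1, 1, 2, 2], a=[1, 2, 3, 4], b=[10, 20, 30, 40])
gd = groupby(df, :g, sort=true)

# Two source columns arrive as two positional arguments
dots = combine(gd, [:a, :b] => ((a, b) -> sum(a .* b)) => :dot)

# ByRow applies the function to each row instead of whole group columns
rowsum = transform(gd, [:a, :b] => ByRow(+) => :ab)

# keepkeys=false drops the grouping column :g from the result
nokeys = combine(gd, :a => sum => :a_sum, keepkeys=false)

# ungroup=false returns a GroupedDataFrame instead of a DataFrame
regrouped = combine(gd, :a => sum => :a_sum, ungroup=false)
```

In this sketch dots has one row per group, rowsum has as many rows as df, nokeys contains only the a_sum column, and regrouped can be passed straight to another combine or transform call.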
You can also pass a function directly to all these functions; as a special case, it may be given as the first argument. In this case the return value can be a table, which in particular makes it easy to drop groups: return an empty table from the function.
A function passed this way receives a SubDataFrame as its argument, and you can use do-block syntax.
Here is an example:
combine(groupby(x, :id)) do sdf
n = nrow(sdf)
n < 25 ? DataFrame() : DataFrame(n=n) ## drop groups with low number of rows
end
| Row | id | n |
|---|---|---|
| Char | Int64 | |
| 1 | d | 29 |
| 2 | b | 25 |
| 3 | c | 26 |
You can also produce multiple columns in a single operation:
df = DataFrame(id=[1, 1, 2, 2], val=[1, 2, 3, 4])
| Row | id | val |
|---|---|---|
| Int64 | Int64 | |
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 3 |
| 4 | 2 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => AsTable)
end
| Row | id | x1 | x2 |
|---|---|---|---|
| Int64 | Int64 | Int64 | |
| 1 | 1 | 1 | 2 |
| 2 | 2 | 3 | 4 |
@chain df begin
groupby(:id)
combine(:val => (x -> [x]) => [:c1, :c2])
end
| Row | id | c1 | c2 |
|---|---|---|---|
| Int64 | Int64 | Int64 | |
| 1 | 1 | 1 | 2 |
| 2 | 2 | 3 | 4 |
It is easy to unnest a column into multiple columns:
df = DataFrame(a=[(p=1, q=2), (p=3, q=4)])
select(df, :a => AsTable)
| Row | p | q |
|---|---|---|
| Int64 | Int64 | |
| 1 | 1 | 2 |
| 2 | 3 | 4 |
automatic column names generated
df = DataFrame(a=[[1, 2], [3, 4]])
select(df, :a => AsTable)
| Row | x1 | x2 |
|---|---|---|
| Int64 | Int64 | |
| 1 | 1 | 2 |
| 2 | 3 | 4 |
custom column names generated
select(df, :a => [:C1, :C2])
| Row | C1 | C2 |
|---|---|---|
| Int64 | Int64 | |
| 1 | 1 | 2 |
| 2 | 3 | 4 |
Finally, observe that one can conveniently apply multiple transformations using broadcasting:
df = DataFrame(id=repeat(1:10, 10), x1=1:100, x2=101:200)
| Row | id | x1 | x2 |
|---|---|---|---|
| Int64 | Int64 | Int64 | |
| 1 | 1 | 1 | 101 |
| 2 | 2 | 2 | 102 |
| 3 | 3 | 3 | 103 |
| 4 | 4 | 4 | 104 |
| 5 | 5 | 5 | 105 |
| 6 | 6 | 6 | 106 |
| 7 | 7 | 7 | 107 |
| 8 | 8 | 8 | 108 |
| 9 | 9 | 9 | 109 |
| 10 | 10 | 10 | 110 |
| 11 | 1 | 11 | 111 |
| 12 | 2 | 12 | 112 |
| 13 | 3 | 13 | 113 |
| ⋮ | ⋮ | ⋮ | ⋮ |
| 89 | 9 | 89 | 189 |
| 90 | 10 | 90 | 190 |
| 91 | 1 | 91 | 191 |
| 92 | 2 | 92 | 192 |
| 93 | 3 | 93 | 193 |
| 94 | 4 | 94 | 194 |
| 95 | 5 | 95 | 195 |
| 96 | 6 | 96 | 196 |
| 97 | 7 | 97 | 197 |
| 98 | 8 | 98 | 198 |
| 99 | 9 | 99 | 199 |
| 100 | 10 | 100 | 200 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> minimum)
end
| Row | id | x1_minimum | x2_minimum |
|---|---|---|---|
| Int64 | Int64 | Int64 | |
| 1 | 1 | 1 | 101 |
| 2 | 2 | 2 | 102 |
| 3 | 3 | 3 | 103 |
| 4 | 4 | 4 | 104 |
| 5 | 5 | 5 | 105 |
| 6 | 6 | 6 | 106 |
| 7 | 7 | 7 | 107 |
| 8 | 8 | 8 | 108 |
| 9 | 9 | 9 | 109 |
| 10 | 10 | 10 | 110 |
@chain df begin
groupby(:id)
combine([:x1, :x2] .=> [minimum maximum])
end
| Row | id | x1_minimum | x2_minimum | x1_maximum | x2_maximum |
|---|---|---|---|---|---|
| Int64 | Int64 | Int64 | Int64 | Int64 | |
| 1 | 1 | 1 | 101 | 91 | 191 |
| 2 | 2 | 2 | 102 | 92 | 192 |
| 3 | 3 | 3 | 103 | 93 | 193 |
| 4 | 4 | 4 | 104 | 94 | 194 |
| 5 | 5 | 5 | 105 | 95 | 195 |
| 6 | 6 | 6 | 106 | 96 | 196 |
| 7 | 7 | 7 | 107 | 97 | 197 |
| 8 | 8 | 8 | 108 | 98 | 198 |
| 9 | 9 | 9 | 109 | 99 | 199 |
| 10 | 10 | 10 | 110 | 100 | 200 |
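To see why the last call produces four result columns, it may help to look at the broadcast on its own. A minimal sketch, assuming a small hypothetical data frame: a column vector of names broadcast against a 1×2 row of functions yields a 2×2 matrix of Pairs, and combine applies every entry.

```julia
using DataFrames

# A 2-element vector broadcast against a 1×2 row matrix gives a 2×2 matrix
ops = [:x1, :x2] .=> [minimum maximum]

# Hypothetical toy data
df = DataFrame(id=[1, 1, 2, 2], x1=1:4, x2=5:8)

# combine accepts the whole matrix of Pairs; entries are applied
# in column-major order
res = combine(groupby(df, :id, sort=true), ops)
```

Because the matrix is traversed column by column, the minimum columns for x1 and x2 come before the maximum columns, which matches the column order in the output above.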
Aggregation of a data frame using mapcols
x = DataFrame(rand(10, 10), :auto)
| Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 0.857372 | 0.0604398 | 0.325589 | 0.338694 | 0.766581 | 0.794892 | 0.32041 | 0.0697253 | 0.944804 | 0.845565 |
| 2 | 0.66909 | 0.588814 | 0.0183379 | 0.80083 | 0.496673 | 0.0624174 | 0.48674 | 0.498084 | 0.173867 | 0.604959 |
| 3 | 0.6192 | 0.269699 | 0.0714028 | 0.0842104 | 0.0646443 | 0.875336 | 0.141855 | 0.236986 | 0.313379 | 0.0745637 |
| 4 | 0.851508 | 0.0858474 | 0.840433 | 0.19462 | 0.878192 | 0.00313476 | 0.203818 | 0.71006 | 0.963675 | 0.384315 |
| 5 | 0.865508 | 0.186952 | 0.260236 | 0.730858 | 0.92906 | 0.864217 | 0.997037 | 0.947447 | 0.739623 | 0.413455 |
| 6 | 0.00744464 | 0.803786 | 0.194378 | 0.794546 | 0.0273659 | 0.310378 | 0.622336 | 0.649783 | 0.532184 | 0.873814 |
| 7 | 0.12596 | 0.814005 | 0.0930538 | 0.185945 | 0.170085 | 0.214734 | 0.133885 | 0.597884 | 0.186861 | 0.893142 |
| 8 | 0.780113 | 0.766513 | 0.397515 | 0.432241 | 0.431676 | 0.519152 | 0.652428 | 0.0432491 | 0.237644 | 0.796689 |
| 9 | 0.338559 | 0.800977 | 0.790461 | 0.94481 | 0.163392 | 0.160652 | 0.482806 | 0.798111 | 0.583562 | 0.523887 |
| 10 | 0.188492 | 0.729708 | 0.109974 | 0.566246 | 0.260343 | 0.357419 | 0.282843 | 0.168503 | 0.905404 | 0.748549 |
mapcols(mean, x)
| Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 0.530325 | 0.510674 | 0.310138 | 0.5073 | 0.418801 | 0.416233 | 0.432416 | 0.471983 | 0.5581 | 0.615894 |
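mapcols is not limited to functions that reduce a column to a single value. As a sketch on a hypothetical data frame, a function that returns a vector of the same length transforms each column and keeps the number of rows:

```julia
using DataFrames
using Statistics

# Hypothetical toy data
df = DataFrame(a=[1.0, 2.0, 3.0], b=[10.0, 20.0, 30.0])

# Subtract the column mean from every column; the result is a new
# DataFrame with the same column names and the same number of rows
centered = mapcols(col -> col .- mean(col), df)
```

The source data frame df is left unchanged.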
Mapping rows and columns using eachcol and eachrow
map a function over each column and return a vector
map(mean, eachcol(x))
10-element Vector{Float64}:
0.5303245428009665
0.5106740318449894
0.31013809099463696
0.5073001344079734
0.4188011876439576
0.4162331753066006
0.43241578776594264
0.47198321172017604
0.5581001809827705
0.6158937739356688
iterating over pairs(eachcol(x)) yields a Pair of column name and column values
foreach(c -> println(c[1], ": ", mean(c[2])), pairs(eachcol(x)))
x1: 0.5303245428009665
x2: 0.5106740318449894
x3: 0.31013809099463696
x4: 0.5073001344079734
x5: 0.4188011876439576
x6: 0.4162331753066006
x7: 0.43241578776594264
x8: 0.47198321172017604
x9: 0.5581001809827705
x10: 0.6158937739356688
now each element is a DataFrameRow, which works like a NamedTuple but is a view into the parent DataFrame
map(r -> r.x1 / r.x2, eachrow(x))
10-element Vector{Float64}:
14.185545224261652
1.1363353510787026
2.295892834255423
9.918850731427545
4.629583189916868
0.009261975035659207
0.1547412524365301
1.0177421879049602
0.4226822659171988
0.2583112239652292
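Because a DataFrameRow is a view, writing to it changes the parent data frame. A minimal sketch, assuming a hypothetical two-row data frame:

```julia
using DataFrames

# Hypothetical toy data
df = DataFrame(x1=[1.0, 2.0], x2=[3.0, 4.0])

# Take the first row as a DataFrameRow (a view, not a copy)
row = first(eachrow(df))

# Assigning through the view writes into the parent data frame
row.x1 = 100.0

# Converting to a NamedTuple gives a detached copy of the values
nt = NamedTuple(df[2, :])
```

After the assignment, df.x1[1] is 100.0, while nt holds a snapshot of the second row that later changes to df do not affect.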
it prints like a data frame, only the caption is different so that you know the type of the object
er = eachrow(x)
er.x1 ## you can access columns of a parent data frame directly
10-element Vector{Float64}:
0.857371938863679
0.6690899418911187
0.6191999149726781
0.8515080259405692
0.8655075819602075
0.007444641674264618
0.12596020343974845
0.7801126668947531
0.33855880913447767
0.18849170323816855
it prints like a data frame, only the caption is different so that you know the type of the object
ec = eachcol(x)
| Row | x1 | x2 | x3 | x4 | x5 | x6 | x7 | x8 | x9 | x10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | |
| 1 | 0.857372 | 0.0604398 | 0.325589 | 0.338694 | 0.766581 | 0.794892 | 0.32041 | 0.0697253 | 0.944804 | 0.845565 |
| 2 | 0.66909 | 0.588814 | 0.0183379 | 0.80083 | 0.496673 | 0.0624174 | 0.48674 | 0.498084 | 0.173867 | 0.604959 |
| 3 | 0.6192 | 0.269699 | 0.0714028 | 0.0842104 | 0.0646443 | 0.875336 | 0.141855 | 0.236986 | 0.313379 | 0.0745637 |
| 4 | 0.851508 | 0.0858474 | 0.840433 | 0.19462 | 0.878192 | 0.00313476 | 0.203818 | 0.71006 | 0.963675 | 0.384315 |
| 5 | 0.865508 | 0.186952 | 0.260236 | 0.730858 | 0.92906 | 0.864217 | 0.997037 | 0.947447 | 0.739623 | 0.413455 |
| 6 | 0.00744464 | 0.803786 | 0.194378 | 0.794546 | 0.0273659 | 0.310378 | 0.622336 | 0.649783 | 0.532184 | 0.873814 |
| 7 | 0.12596 | 0.814005 | 0.0930538 | 0.185945 | 0.170085 | 0.214734 | 0.133885 | 0.597884 | 0.186861 | 0.893142 |
| 8 | 0.780113 | 0.766513 | 0.397515 | 0.432241 | 0.431676 | 0.519152 | 0.652428 | 0.0432491 | 0.237644 | 0.796689 |
| 9 | 0.338559 | 0.800977 | 0.790461 | 0.94481 | 0.163392 | 0.160652 | 0.482806 | 0.798111 | 0.583562 | 0.523887 |
| 10 | 0.188492 | 0.729708 | 0.109974 | 0.566246 | 0.260343 | 0.357419 | 0.282843 | 0.168503 | 0.905404 | 0.748549 |
you can access columns of a parent data frame directly
ec.x1
10-element Vector{Float64}:
0.857371938863679
0.6690899418911187
0.6191999149726781
0.8515080259405692
0.8655075819602075
0.007444641674264618
0.12596020343974845
0.7801126668947531
0.33855880913447767
0.18849170323816855
Transposing
you can transpose a data frame using permutedims:
df = DataFrame(reshape(1:12, 3, 4), :auto)
| Row | x1 | x2 | x3 | x4 |
|---|---|---|---|---|
| Int64 | Int64 | Int64 | Int64 | |
| 1 | 1 | 4 | 7 | 10 |
| 2 | 2 | 5 | 8 | 11 |
| 3 | 3 | 6 | 9 | 12 |
df.names = ["a", "b", "c"]
3-element Vector{String}:
"a"
"b"
"c"
permutedims(df, :names)
| Row | names | a | b | c |
|---|---|---|---|---|
| String | Int64 | Int64 | Int64 | |
| 1 | x1 | 1 | 2 | 3 |
| 2 | x2 | 4 | 5 | 6 |
| 3 | x3 | 7 | 8 | 9 |
| 4 | x4 | 10 | 11 | 12 |
This notebook was generated using Literate.jl.