Basic information about a data frame

Let’s start by creating a DataFrame object, x, so that we can learn how to get information on that data frame.

using DataFrames
x = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])
2×3 DataFrame
 Row │ A      B          C
     │ Int64  Float64?   String
─────┼──────────────────────────
   1 │     1        1.0  a
   2 │     2  missing    b

The standard size function works to get dimensions of the DataFrame,

size(x), size(x, 1), size(x, 2)
((2, 3), 2, 3)

as well as nrow and ncol from R.

nrow(x), ncol(x)
(2, 3)

describe gives basic summary statistics of data in your DataFrame (check out the help of describe for information on how to customize shown statistics).

describe(x)
3×7 DataFrame
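 Row │ variable  mean    min  median  max  nmissing  eltype
     │ Symbol    Union…  Any  Union…  Any  Int64     Type
─────┼──────────────────────────────────────────────────────────────────────
   1 │ A         1.5     1    1.5     2           0  Int64
   2 │ B         1.0     1.0  1.0     1.0         1  Union{Missing, Float64}
   3 │ C                 a            b           0  String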

you can limit the columns shown by describe using the cols keyword argument

describe(x, cols=1:2)
2×7 DataFrame
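 Row │ variable  mean     min   median   max   nmissing  eltype
     │ Symbol    Float64  Real  Float64  Real  Int64     Type
─────┼─────────────────────────────────────────────────────────────────────────
   1 │ A             1.5  1         1.5  2            0  Int64
   2 │ B             1.0  1.0       1.0  1.0          1  Union{Missing, Float64}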
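
As a minimal sketch of the customization mentioned above, describe also accepts the statistics to compute as positional arguments: built-in ones as Symbols and custom ones as function => name pairs. The names d, s1, s2 and the column name :total are illustrative choices, not part of the original example.

```julia
using DataFrames

d = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])

# Ask only for selected built-in statistics.
s1 = describe(d, :mean, :nmissing)

# Add a custom statistic as a `function => name` pair (:total is a name
# chosen for this sketch). Missing values are skipped before the function
# is called. The summary is restricted to the numeric columns A and B.
s2 = describe(d, :mean, sum => :total, cols=[:A, :B])
```

Each requested statistic becomes one column of the resulting data frame, next to the variable column.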

names will return the names of all columns as strings

names(x)
3-element Vector{String}:
 "A"
 "B"
 "C"

you can also get column names with a given element type (eltype):

names(x, String)
1-element Vector{String}:
 "C"

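The type passed to names is matched against each column's element type using subtyping, and names also accepts the usual column selectors. A short sketch, using a separate data frame d so that x is left untouched:

```julia
using DataFrames

d = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])

# Column B has eltype Union{Missing, Float64}, which is not a subtype of Real,
# so only A is selected here.
real_cols = names(d, Real)

# Allowing Missing in the type also picks up B.
real_or_missing = names(d, Union{Missing, Real})

# Column selectors work as well: everything except A, or names matching a regex.
without_a = names(d, Not(:A))
a_or_b = names(d, r"^[AB]$")
```
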
use propertynames to get a vector of Symbols:

propertynames(x)
3-element Vector{Symbol}:
 :A
 :B
 :C

eltype on eachcol(x) returns element types of columns:

eltype.(eachcol(x))
3-element Vector{Type}:
 Int64
 Union{Missing, Float64}
 String

Here we create a larger DataFrame

y = DataFrame(rand(1:10, 1000, 10), :auto)
1000×10 DataFrame
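  Row │ x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
      │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
──────┼──────────────────────────────────────────────────────────────────────
    1 │     2     10      7      3      2     10      3      9      1      2
    2 │     2      6      1      9      7     10      8      4      5      3
    3 │     3     10      8      6     10      8      3      5      7      4
    4 │     9      7     10      4      3      9      3     10      4     10
    5 │     4      3      2      1      1      9      5      5      6      8
    6 │    10      3      8      8      5      3      4      7      9      8
    7 │    10      3      7      4      9     10      9      4      9      5
    8 │     1      8      8      1      2      4     10      8      8      9
    9 │     6      5      8     10      1      1      5     10      2      4
   10 │     1      5      4      7      2      5      1      3      4      1
   11 │     9      1     10      7      1      9      7      1      3      2
   12 │     5      9      9      7      8      9      5      4      4      5
   13 │     5      6      2      1      3      8     10      4      7      9
  ⋮   │   ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮      ⋮
  989 │     2      7      2      7     10      4      7      8     10      8
  990 │     7      8      6      5      2      8      6      4      9      8
  991 │     2      9      5      1      6      8     10      4      1      1
  992 │     8      7      5      4      7      3     10      8      6     10
  993 │     4      7      9      8      1      7      7      1      3      1
  994 │     6      4      5      8      5      1      2      1      5      3
  995 │     1      6      6      7      5      7      4     10      8      2
  996 │     3     10      9      4     10      9     10     10      8      9
  997 │     6      1      7     10     10      1      3      8      5      1
  998 │     9      3      4      5      6      3     10      7      6      9
  999 │     5      4      5      4      1      7      8      6     10     10
 1000 │     8      9      4      7      9      8     10      8      3      2
                                                            975 rows omitted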

and then we can use first to peek into its first few rows

first(y, 5)
5×10 DataFrame
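 Row │ x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
     │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
─────┼──────────────────────────────────────────────────────────────────────
   1 │     2     10      7      3      2     10      3      9      1      2
   2 │     2      6      1      9      7     10      8      4      5      3
   3 │     3     10      8      6     10      8      3      5      7      4
   4 │     9      7     10      4      3      9      3     10      4     10
   5 │     4      3      2      1      1      9      5      5      6      8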

and last to see its bottom rows.

last(y, 3)
3×10 DataFrame
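 Row │ x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
     │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
─────┼──────────────────────────────────────────────────────────────────────
   1 │     9      3      4      5      6      3     10      7      6      9
   2 │     5      4      5      4      1      7      8      6     10     10
   3 │     8      9      4      7      9      8     10      8      3      2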

Using first and last without the number of rows returns the first or last row of the DataFrame as a DataFrameRow

first(y)
DataFrameRow (10 columns)
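  Row │ x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
      │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
──────┼──────────────────────────────────────────────────────────────────────
    1 │     2     10      7      3      2     10      3      9      1      2

last(y)
DataFrameRow (10 columns)
  Row │ x1     x2     x3     x4     x5     x6     x7     x8     x9     x10
      │ Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64  Int64
──────┼──────────────────────────────────────────────────────────────────────
 1000 │     8      9      4      7      9      8     10      8      3      2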

Displaying large data frames

Create a wide and tall data frame:

df = DataFrame(rand(100, 100), :auto)
100×100 DataFrame
 Row │ x1        x2         x3         x4         x5         x6         x7         x8         ⋯
     │ Float64   Float64    Float64    Float64    Float64    Float64    Float64    Float64    ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────
   1 │ 0.853871  0.798786   0.563023   0.335343   0.895971   0.154978   0.311922   0.26083    ⋯
   2 │ 0.259409  0.587895   0.777552   0.289475   0.699264   0.296365   0.225222   0.856751
   3 │ 0.936728  0.588592   0.0314927  0.110726   0.346444   0.634204   0.554418   0.649742
   4 │ 0.545509  0.819688   0.660993   0.105134   0.900076   0.119826   0.712259   0.389215
   5 │ 0.761057  0.335219   0.245305   0.439633   0.518318   0.469356   0.849265   0.628817   ⋯
   6 │ 0.414058  0.0308165  0.382344   0.776271   0.66971    0.917266   0.406434   0.554858
   7 │ 0.86916   0.697772   0.8548     0.847469   0.723711   0.892343   0.462628   0.799499
   8 │ 0.53208   0.59186    0.464457   0.193331   0.122691   0.645427   0.0501467  0.166021
   9 │ 0.527354  0.69845    0.176888   0.968154   0.614275   0.825652   0.573767   0.450972   ⋯
  10 │ 0.511376  0.741191   0.402993   0.158119   0.360876   0.270342   0.646121   0.134846
  11 │ 0.689966  0.738278   0.413117   0.95356    0.0444413  0.778365   0.398975   0.839266
  12 │ 0.217928  0.939725   0.744149   0.411019   0.521826   0.446385   0.542991   0.627298
  13 │ 0.867813  0.345707   0.486086   0.898274   0.756454   0.402357   0.670654   0.209123   ⋯
  ⋮  │    ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮      ⋱
  89 │ 0.169121  0.715488   0.505173   0.0599241  0.915959   0.787838   0.97444    0.492775
  90 │ 0.467211  0.126609   0.50295    0.8172     0.137492   0.489873   0.302164   0.0527199
  91 │ 0.684352  0.120026   0.318574   0.202233   0.780588   0.241173   0.477366   0.16023    ⋯
  92 │ 0.450417  0.319657   0.549945   0.60887    0.496068   0.120692   0.733096   0.37883
  93 │ 0.922841  0.997988   0.912199   0.147143   0.160719   0.936032   0.453864   0.105058
  94 │ 0.264461  0.151572   0.213574   0.242099   0.225507   0.914375   0.775416   0.535287
  95 │ 0.616257  0.0506292  0.534545   0.239108   0.60322    0.510976   0.151631   0.999907   ⋯
  96 │ 0.395876  0.740905   0.226839   0.777899   0.289366   0.0656131  0.592383   0.777574
  97 │ 0.582703  0.529826   0.863326   0.21976    0.932858   0.193471   0.674654   0.874244
  98 │ 0.173092  0.0869494  0.37013    0.832427   0.507907   0.0768052  0.61513    0.587615
  99 │ 0.747376  0.163959   0.164125   0.226742   0.66288    0.809144   0.179686   0.873394   ⋯
 100 │ 0.611142  0.421028   0.548678   0.671724   0.0671658  0.134599   0.0591507  0.20212
                                                                 92 columns and 75 rows omitted

We can see that the output is truncated: 75 of the 100 rows are omitted, and when the display width is limited most of the columns are left out as well. You can change this behavior by setting the values of ENV["LINES"] and ENV["COLUMNS"].

withenv("LINES" => 10, "COLUMNS" => 200) do
    show(df)
end
100×100 DataFrame
 Row │ x1        x2         x3         x4         x5         x6         x7     ⋯
     │ Float64   Float64    Float64    Float64    Float64    Float64    Float6 ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │ 0.853871  0.798786   0.563023   0.335343   0.895971   0.154978   0.3119 ⋯
   2 │ 0.259409  0.587895   0.777552   0.289475   0.699264   0.296365   0.2252
   3 │ 0.936728  0.588592   0.0314927  0.110726   0.346444   0.634204   0.5544
   4 │ 0.545509  0.819688   0.660993   0.105134   0.900076   0.119826   0.7122
   5 │ 0.761057  0.335219   0.245305   0.439633   0.518318   0.469356   0.8492 ⋯
   6 │ 0.414058  0.0308165  0.382344   0.776271   0.66971    0.917266   0.4064
   7 │ 0.86916   0.697772   0.8548     0.847469   0.723711   0.892343   0.4626
   8 │ 0.53208   0.59186    0.464457   0.193331   0.122691   0.645427   0.0501
   9 │ 0.527354  0.69845    0.176888   0.968154   0.614275   0.825652   0.5737 ⋯
  10 │ 0.511376  0.741191   0.402993   0.158119   0.360876   0.270342   0.6461
  11 │ 0.689966  0.738278   0.413117   0.95356    0.0444413  0.778365   0.3989
  ⋮  │    ⋮          ⋮          ⋮          ⋮          ⋮          ⋮          ⋮  ⋱
  91 │ 0.684352  0.120026   0.318574   0.202233   0.780588   0.241173   0.4773
  92 │ 0.450417  0.319657   0.549945   0.60887    0.496068   0.120692   0.7330 ⋯
  93 │ 0.922841  0.997988   0.912199   0.147143   0.160719   0.936032   0.4538
  94 │ 0.264461  0.151572   0.213574   0.242099   0.225507   0.914375   0.7754
  95 │ 0.616257  0.0506292  0.534545   0.239108   0.60322    0.510976   0.1516
  96 │ 0.395876  0.740905   0.226839   0.777899   0.289366   0.0656131  0.5923 ⋯
  97 │ 0.582703  0.529826   0.863326   0.21976    0.932858   0.193471   0.6746
  98 │ 0.173092  0.0869494  0.37013    0.832427   0.507907   0.0768052  0.6151
  99 │ 0.747376  0.163959   0.164125   0.226742   0.66288    0.809144   0.1796
 100 │ 0.611142  0.421028   0.548678   0.671724   0.0671658  0.134599   0.0591 ⋯
                                                  94 columns and 79 rows omitted
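
Besides the environment variables, truncation can also be controlled per call. The following is a sketch under stated assumptions: it relies on the allcols keyword of show for data frames and on the standard :limit and :displaysize IOContext properties; render and wide are hypothetical names introduced only for this illustration.

```julia
using DataFrames

# Hypothetical helper: render `df` into a string the way a size-limited
# display would. :limit => true switches truncation on and :displaysize
# gives the available (rows, columns) in characters.
function render(df; kwargs...)
    buf = IOBuffer()
    io = IOContext(buf, :limit => true, :displaysize => (10, 80))
    show(io, df; kwargs...)
    return String(take!(buf))
end

wide = DataFrame(rand(3, 40), :auto)      # columns x1 ... x40

truncated = render(wide)                  # cut to fit the 80-character width
everything = render(wide; allcols=true)   # all columns are printed

println(occursin("x40", truncated))
println(occursin("x40", everything))
```

The allrows keyword works the same way for rows. Passing these keywords affects a single show call, whereas ENV["LINES"] and ENV["COLUMNS"] change the assumed display size for everything executed inside the withenv block.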

Most elementary get and set operations

Given the DataFrame x we have created earlier, here are various ways to grab one of its columns as a Vector.

x = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])
2×3 DataFrame
 Row │ A      B          C
     │ Int64  Float64?   String
─────┼──────────────────────────
   1 │     1        1.0  a
   2 │     2  missing    b

all get the vector stored in our DataFrame without copying it

x.A, x[!, 1], x[!, :A]
([1, 2], [1, 2], [1, 2])

the same using string indexing

x."A", x[!, "A"]
([1, 2], [1, 2])

note that this creates a copy

x[:, 1]
2-element Vector{Int64}:
 1
 2
x[:, 1] === x[:, 1]
false
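
The practical consequence of the difference between ! and : can be sketched as follows. A separate data frame d is used so that x stays unchanged; the names alias and snapshot are illustrative.

```julia
using DataFrames

d = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])

alias = d[!, :A]      # the vector stored in d, no copy
snapshot = d[:, :A]   # an independent copy

alias[1] = 100        # writes through to d
snapshot[2] = -1      # leaves d untouched

println(d.A)
println(alias === d.A, " ", snapshot === d.A)
```
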

To grab one row as a DataFrame, we can index as follows.

x[1:1, :]
1×3 DataFrame
 Row │ A      B         C
     │ Int64  Float64?  String
─────┼─────────────────────────
   1 │     1       1.0  a

this produces a DataFrameRow, which is treated as a 1-dimensional object similar to a NamedTuple

x[1, :]
DataFrameRow (3 columns)
 Row │ A      B         C
     │ Int64  Float64?  String
─────┼─────────────────────────
   1 │     1       1.0  a
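
A brief sketch of what "similar to a NamedTuple" means in practice, again on a separate data frame d. Unlike a NamedTuple, a DataFrameRow is a view into its parent, so assigning to it changes the data frame.

```julia
using DataFrames

d = DataFrame(A=[1, 2], B=[1.0, missing], C=["a", "b"])
r = d[1, :]

# Fields can be read by property, Symbol, or String, as in a NamedTuple.
vals = (r.A, r[:B], r["C"])

# Convert to a real NamedTuple to get an independent snapshot of the row.
nt = NamedTuple(r)

# Writing to the row writes to the parent data frame.
r.A = 10
println(d.A)
println(nt)
```
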

We can grab a single cell or element using the same syntax as for an element of an array.

x[1, 1]
1

or a new DataFrame that is a subset of rows and columns

x[1:2, 1:2]
2×2 DataFrame
 Row │ A      B
     │ Int64  Float64?
─────┼──────────────────
   1 │     1        1.0
   2 │     2  missing

You can also use a Regex to select columns, and Not from InvertedIndices.jl to select both rows and columns

x[Not(1), r"A"]
1×1 DataFrame
 Row │ A
     │ Int64
─────┼───────
   1 │     2

! indicates that underlying columns are not copied

x[!, Not(1)]
2×2 DataFrame
 Row │ B          C
     │ Float64?   String
─────┼───────────────────
   1 │       1.0  a
   2 │ missing    b

: means that the columns will get copied

x[:, Not(1)]
2×2 DataFrame
 Row │ B          C
     │ Float64?   String
─────┼───────────────────
   1 │       1.0  a
   2 │ missing    b

Assignment of a scalar to a data frame can be done in ranges using broadcasting:

x[1:2, 1:2] .= 1
x
2×3 DataFrame
 Row │ A      B         C
     │ Int64  Float64?  String
─────┼─────────────────────────
   1 │     1       1.0  a
   2 │     1       1.0  b

Assignment of a vector whose length equals the number of assigned rows, using broadcasting:

x[1:2, 1:2] .= [1, 2]
x
2×3 DataFrame
 Row │ A      B         C
     │ Int64  Float64?  String
─────┼─────────────────────────
   1 │     1       1.0  a
   2 │     2       2.0  b

Assignment of another data frame of matching size and column names, again using broadcasting:

x[1:2, 1:2] .= DataFrame([5 6; 7 8], [:A, :B])
x
2×3 DataFrame
 Row │ A      B         C
     │ Int64  Float64?  String
─────┼─────────────────────────
   1 │     5       6.0  a
   2 │     7       8.0  b
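
All of the broadcasting assignments above write into the existing columns, which is why the element types of A and B did not change. As a sketch of how this interacts with the row selector when a single existing column is targeted (d is a separate data frame; inplace_failed is an illustrative name): with : the values are written in place and must be convertible to the column's element type, while with ! the column is replaced by a newly allocated one.

```julia
using DataFrames

d = DataFrame(A=[1, 2], B=[1.0, missing])

# In place: the column keeps its Int64 element type.
d[:, :A] .= 10

# In place with a value that does not fit into Int64 fails.
inplace_failed = try
    d[:, :A] .= 0.5
    false
catch e
    e isa InexactError
end

# With `!` the column is replaced, so its element type may change.
d[!, :A] .= 0.5

println(inplace_failed, " ", eltype(d.A))
```
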

Caution

With the df[!, :col] and df.col syntax you get direct (non-copying) access to a column of a data frame. This is potentially unsafe, as you can easily corrupt data in the df data frame if you resize, sort, etc. the column obtained in this way. Therefore such access should be used with caution.

Similarly, df[!, cols], when cols is a collection of columns, produces a new data frame that holds the same (not copied) columns as the source df data frame. Modifying the data frame obtained via df[!, cols] might therefore cause problems with the consistency of df.

The df[:, :col] and df[:, cols] syntaxes always copy columns so they are safe to use (and should generally be preferred except for performance or memory critical use cases).

Here are examples of how Cols and Between can be used to select columns of a data frame.

x = DataFrame(rand(4, 5), :auto)
4×5 DataFrame
 Row │ x1        x2        x3        x4        x5
     │ Float64   Float64   Float64   Float64   Float64
─────┼───────────────────────────────────────────────────
   1 │ 0.642421  0.779665  0.244597  0.898846  0.0526722
   2 │ 0.129142  0.954267  0.812981  0.208121  0.446236
   3 │ 0.53473   0.184269  0.601634  0.574763  0.87954
   4 │ 0.670097  0.733946  0.841405  0.578526  0.716134

x[:, Between(:x2, :x4)]
4×3 DataFrame
 Row │ x2        x3        x4
     │ Float64   Float64   Float64
─────┼──────────────────────────────
   1 │ 0.779665  0.244597  0.898846
   2 │ 0.954267  0.812981  0.208121
   3 │ 0.184269  0.601634  0.574763
   4 │ 0.733946  0.841405  0.578526

x[:, Cols("x1", Between("x2", "x4"))]
4×4 DataFrame
 Row │ x1        x2        x3        x4
     │ Float64   Float64   Float64   Float64
─────┼────────────────────────────────────────
   1 │ 0.642421  0.779665  0.244597  0.898846
   2 │ 0.129142  0.954267  0.812981  0.208121
   3 │ 0.53473   0.184269  0.601634  0.574763
   4 │ 0.670097  0.733946  0.841405  0.578526
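
To see how these selectors combine without looking at random values, the sketch below passes them to names instead of indexing. Cols takes the union of its arguments in the order in which columns first appear, which makes it handy for reordering. The data frame d and the result names are illustrative.

```julia
using DataFrames

d = DataFrame(rand(4, 5), :auto)   # columns x1 ... x5

between = names(d, Between(:x2, :x4))

# The order follows the selectors: x5 first, then the x1..x2 range.
reordered = names(d, Cols(:x5, Between(:x1, :x2)))

# Columns matched more than once are kept only once.
deduplicated = names(d, Cols(r"x[12]", :x2, :x5))

# Move x3 to the end: everything except x3, followed by x3.
x3_last = names(d, Cols(Not(:x3), :x3))
```
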

Views

You can simply create a view of a DataFrame (it is more efficient than creating a materialized selection). Here are the possible return value options.

@view x[1:2, 1]
2-element view(::Vector{Float64}, 1:2) with eltype Float64:
 0.6424209365095274
 0.12914205409221835
@view x[1, 1]
0-dimensional view(::Vector{Float64}, 1) with eltype Float64:
0.6424209365095274

a DataFrameRow, the same as for x[1, 1:2] without a view

@view x[1, 1:2]
DataFrameRow (2 columns)
 Row │ x1        x2
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.642421  0.779665

a SubDataFrame

@view x[1:2, 1:2]
2×2 SubDataFrame
 Row │ x1        x2
     │ Float64   Float64
─────┼────────────────────
   1 │ 0.642421  0.779665
   2 │ 0.129142  0.954267
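
Because a view does not copy data, writing through it changes the parent data frame. A minimal sketch with a small separate data frame d (the names v and materialized are illustrative):

```julia
using DataFrames

d = DataFrame(a=[1, 2, 3], b=[10, 20, 30])

v = @view d[1:2, :]       # a SubDataFrame over the first two rows

v[2, :b] = -20            # assignment through the view
v.a[1] = 100              # v.a is itself a view into d.a

materialized = copy(v)    # an independent DataFrame with the selected rows

println(d)
println(parent(v) === d)
```

Use copy when the selection should stop tracking the parent.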

Adding new columns to a data frame

df = DataFrame()
0×0 DataFrame

using setproperty! (element assignment)

x = [1, 2, 3]
df.a = x
df
3×1 DataFrame
 Row │ a
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3

no copy is performed

df.a === x
true

using setindex!

df[!, :b] = x
df[:, :c] = x
df
3×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      1      1
   2 │     2      2      2
   3 │     3      3      3

no copy

df.b === x
true

copy (! and : have different effects)

df.c === x
false

Element-wise assignment

df[!, :d] .= x
df[:, :e] .= x
df
3×5 DataFrame
 Row │ a      b      c      d      e
     │ Int64  Int64  Int64  Int64  Int64
─────┼───────────────────────────────────
   1 │     1      1      1      1      1
   2 │     2      2      2      2      2
   3 │     3      3      3      3      3

both copy, so in this case ! and : have the same effect

df.d === x, df.e === x
(false, false)

note that in our data frame columns :a and :b store the vector x (not a copy)

df.a === df.b === x
true

This can lead to silent errors. For example this code leads to a bug (note that calling pairs on eachcol(df) creates an iterator of (column name, column) pairs):

try
    for (n, c) in pairs(eachcol(df))
        println("$n: ", pop!(c))
    end
catch e
    show(e)
end
a: 3
b: 2
c: 3
d: 3
e: 3

note that for column :b we printed 2 as 3 was removed from it when we used pop! on column :a. Such mistakes sometimes happen. Because of this DataFrames.jl performs consistency checks before doing an expensive operation (most notably before showing a data frame).

try
    show(df)
catch e
    show(e)
end
AssertionError("Data frame is corrupt: length of column :c (2) does not match length of column 1 (1). The column vector has likely been resized unintentionally (either directly or because it is shared with another data frame).")

We can investigate the columns to find out what happened:

collect(pairs(eachcol(df)))
5-element Vector{Pair{Symbol, AbstractVector}}:
 :a => [1]
 :b => [1]
 :c => [1, 2]
 :d => [1, 2]
 :e => [1, 2]

The output confirms that the data frame df got corrupted. DataFrames.jl supports a complete set of getindex, getproperty, setindex!, setproperty!, view, broadcasting, and broadcasting assignment operations. The details are explained here: http://juliadata.github.io/DataFrames.jl/latest/lib/indexing/.
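
One way to avoid this kind of corruption is to make sure every column owns its data. The sketch below uses fresh names (v and d) and stores copies, so that resizing the source vector afterwards does not affect the data frame.

```julia
using DataFrames

v = [1, 2, 3]
d = DataFrame()

d.a = copy(v)     # explicit copy
d[:, :b] = v      # `:` copies when adding a column

pop!(v)           # resizing the source vector no longer touches d

println(nrow(d))
println(d.a !== v && d.b !== v && d.a !== d.b)
```
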

Comparisons

using DataFrames
df = DataFrame(rand(2, 3), :auto)
2×3 DataFrame
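 Row │ x1        x2        x3
     │ Float64   Float64   Float64
─────┼──────────────────────────────
   1 │ 0.189187  0.62926   0.339583
   2 │ 0.745325  0.576159  0.572542

df2 = copy(df)
2×3 DataFrame
 Row │ x1        x2        x3
     │ Float64   Float64   Float64
─────┼──────────────────────────────
   1 │ 0.189187  0.62926   0.339583
   2 │ 0.745325  0.576159  0.572542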

compares column names and contents

df == df2
true

create a minimally different data frame and use isapprox for comparison

df3 = df2 .+ eps()
2×3 DataFrame
 Row │ x1        x2        x3
     │ Float64   Float64   Float64
─────┼──────────────────────────────
   1 │ 0.189187  0.62926   0.339583
   2 │ 0.745325  0.576159  0.572542

df == df3
false

isapprox(df, df3)
true

isapprox(df, df3, atol=eps() / 2)
false

missings are handled as in Julia Base

df = DataFrame(a=missing)
1×1 DataFrame
 Row │ a
     │ Missing
─────┼─────────
   1 │ missing

Same value?

df == df
missing

Same object?

df === df
true
isequal(df, df)
true
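
The three possible outcomes of == can be seen side by side in the following sketch (d1, d2, d3 are illustrative names). A definite mismatch gives false even when missing values are present, while a comparison that cannot be decided gives missing. Since missing cannot be used as a condition in an if statement, isequal is the safer choice there.

```julia
using DataFrames

d1 = DataFrame(a=[1, missing])
d2 = DataFrame(a=[1, missing])
d3 = DataFrame(a=[2, missing])

undecided = d1 == d2       # the missing entries make the result unknown
different = d1 == d3       # 1 != 2 settles the comparison
same = isequal(d1, d2)     # isequal treats missing as equal to missing

println(undecided, " ", different, " ", same)
```
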