== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (19)
               +- Exchange (18)
                  +- * Project (17)
                     +- * BroadcastHashJoin Inner BuildRight (16)
                        :- * Project (9)
                        :  +- * Filter (8)
                        :     +- * ColumnarToRow (7)
                        :        +- InMemoryTableScan (3)
                        :              +- InMemoryRelation (4)
                        :                    +- * Project (6)
                        :                       +- Scan csv (5)
                        +- BroadcastExchange (15)
                           +- * Filter (14)
                              +- InMemoryTableScan (10)
                                    +- InMemoryRelation (11)
                                          +- * Project (13)
                                             +- Scan csv (12)

(1) InMemoryTableScan
Output [7]: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]
Arguments: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]

(2) InMemoryRelation
Arguments: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0 +- Exchange rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7515156] +- *(2) Project [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694] +- *(2) BroadcastHashJoin [sector_id#94125513], [sector_id#93880529], Inner, BuildRight, false :- *(2) Project [sector_id#94125513, numcos#94125528, numdates#94125529, coverage#94125694, round((cast(numcos#94125528 as double) / cast(coverage#94125694 as double)), 0) AS universe#94125754] : +- *(2) Filter isnotnull(sector_id#94125513) : +- *(2) ColumnarToRow : +- InMemoryTableScan [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513], [isnotnull(sector_id#94125513)] : +- InMemoryRelation [sector_id#94125513,
retIC#94125514, resretIC#94125515, numcos#94125528, numdates#94125529, annual_bmret#94125530, annual_ret#94125531, std_ret#94125543, Sharpe_ret#94125653, PctPos_ret#94125655, TR_ret#94125669, IR_ret#94125673, annual_resret#94125675, std_resret#94125677, Sharpe_resret#94125679, PctPos_resret#94125681, TR_resret#94125683, IR_resret#94125685, annual_retnet#94125687, std_retnet#94125688, Sharpe_retnet#94125689, PctPos_retnet#94125690, TR_retnet#94125691, IR_retnet#94125692, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#94125150 = NA) OR (sector_id#94125150 = null)) THEN null ELSE cast(sector_id#94125150 as int) END AS sector_id#94125513, CASE WHEN ((retIC#94125151 = NA) OR (retIC#94125151 = null)) THEN null ELSE cast(retIC#94125151 as float) END AS retIC#94125514, CASE WHEN ((resretIC#94125152 = NA) OR (resretIC#94125152 = null)) THEN null ELSE cast(resretIC#94125152 as float) END AS resretIC#94125515, CASE WHEN ((numcos#94125153 = NA) OR (numcos#94125153 = null)) THEN null ELSE cast(numcos#94125153 as float) END AS numcos#94125528, CASE WHEN ((numdates#94125154 = NA) OR (numdates#94125154 = null)) THEN null ELSE cast(numdates#94125154 as int) END AS numdates#94125529, CASE WHEN ((annual_bmret#94125155 = NA) OR (annual_bmret#94125155 = null)) THEN null ELSE cast(annual_bmret#94125155 as float) END AS annual_bmret#94125530, CASE WHEN ((annual_ret#94125156 = NA) OR (annual_ret#94125156 = null)) THEN null ELSE cast(annual_ret#94125156 as float) END AS annual_ret#94125531, CASE WHEN ((std_ret#94125157 = NA) OR (std_ret#94125157 = null)) THEN null ELSE cast(std_ret#94125157 as float) END AS std_ret#94125543, CASE WHEN ((Sharpe_ret#94125158 = NA) OR (Sharpe_ret#94125158 = null)) THEN null ELSE cast(Sharpe_ret#94125158 as float) END AS Sharpe_ret#94125653, CASE WHEN ((PctPos_ret#94125159 = NA) OR (PctPos_ret#94125159 = null)) THEN null ELSE cast(PctPos_ret#94125159 as float) END AS PctPos_ret#94125655, CASE 
WHEN ((TR_ret#94125160 = NA) OR (TR_ret#94125160 = null)) THEN null ELSE cast(TR_ret#94125160 as float) END AS TR_ret#94125669, CASE WHEN ((IR_ret#94125161 = NA) OR (IR_ret#94125161 = null)) THEN null ELSE cast(IR_ret#94125161 as float) END AS IR_ret#94125673, CASE WHEN ((annual_resret#94125162 = NA) OR (annual_resret#94125162 = null)) THEN null ELSE cast(annual_resret#94125162 as float) END AS annual_resret#94125675, CASE WHEN ((std_resret#94125163 = NA) OR (std_resret#94125163 = null)) THEN null ELSE cast(std_resret#94125163 as float) END AS std_resret#94125677, CASE WHEN ((Sharpe_resret#94125164 = NA) OR (Sharpe_resret#94125164 = null)) THEN null ELSE cast(Sharpe_resret#94125164 as float) END AS Sharpe_resret#94125679, CASE WHEN ((PctPos_resret#94125165 = NA) OR (PctPos_resret#94125165 = null)) THEN null ELSE cast(PctPos_resret#94125165 as float) END AS PctPos_resret#94125681, CASE WHEN ((TR_resret#94125166 = NA) OR (TR_resret#94125166 = null)) THEN null ELSE cast(TR_resret#94125166 as float) END AS TR_resret#94125683, CASE WHEN ((IR_resret#94125167 = NA) OR (IR_resret#94125167 = null)) THEN null ELSE cast(IR_resret#94125167 as float) END AS IR_resret#94125685, CASE WHEN ((annual_retnet#94125168 = NA) OR (annual_retnet#94125168 = null)) THEN null ELSE cast(annual_retnet#94125168 as float) END AS annual_retnet#94125687, CASE WHEN ((std_retnet#94125169 = NA) OR (std_retnet#94125169 = null)) THEN null ELSE cast(std_retnet#94125169 as float) END AS std_retnet#94125688, CASE WHEN ((Sharpe_retnet#94125170 = NA) OR (Sharpe_retnet#94125170 = null)) THEN null ELSE cast(Sharpe_retnet#94125170 as float) END AS Sharpe_retnet#94125689, CASE WHEN ((PctPos_retnet#94125171 = NA) OR (PctPos_retnet#94125171 = null)) THEN null ELSE cast(PctPos_retnet#94125171 as float) END AS PctPos_retnet#94125690, CASE WHEN ((TR_retnet#94125172 = NA) OR (TR_retnet#94125172 = null)) THEN null ELSE cast(TR_retnet#94125172 as float) END AS TR_retnet#94125691, CASE WHEN ((IR_retnet#94125173 = NA) OR 
(IR_retnet#94125173 = null)) THEN null ELSE cast(IR_retnet#94125173 as float) END AS IR_retnet#94125692, ... 2 more fields] : +- FileScan csv [sector_id#94125150,retIC#94125151,resretIC#94125152,numcos#94125153,numdates#94125154,annual_bmret#94125155,annual_ret#94125156,std_ret#94125157,Sharpe_ret#94125158,PctPos_ret#94125159,TR_ret#94125160,IR_ret#94125161,annual_resret#94125162,std_resret#94125163,Sharpe_resret#94125164,PctPos_resret#94125165,TR_resret#94125166,IR_resret#94125167,annual_retnet#94125168,std_retnet#94125169,Sharpe_retnet#94125170,PctPos_retnet#94125171,TR_retnet#94125172,IR_retnet#94125173,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/transcripts/transcript..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7515151] +- *(1) Filter isnotnull(sector_id#93880529) +- InMemoryTableScan [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)] +- InMemoryRelation [sector_id#93880529, sort#93880530, description#93880532, universe#93880534], StorageLevel(disk, memory, deserialized, 1 replicas) +- *(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, 
Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None), [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513]
Arguments: [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513], [isnotnull(sector_id#94125513)]

(4) InMemoryRelation
Arguments: [sector_id#94125513, retIC#94125514, resretIC#94125515, numcos#94125528, numdates#94125529, annual_bmret#94125530, annual_ret#94125531, std_ret#94125543, Sharpe_ret#94125653, PctPos_ret#94125655, TR_ret#94125669, IR_ret#94125673, annual_resret#94125675, std_resret#94125677, Sharpe_resret#94125679, PctPos_resret#94125681, TR_resret#94125683, IR_resret#94125685, annual_retnet#94125687, std_retnet#94125688, Sharpe_retnet#94125689, PctPos_retnet#94125690, TR_retnet#94125691, IR_retnet#94125692, ...
2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94125150 = NA) OR (sector_id#94125150 = null)) THEN null ELSE cast(sector_id#94125150 as int) END AS sector_id#94125513, CASE WHEN ((retIC#94125151 = NA) OR (retIC#94125151 = null)) THEN null ELSE cast(retIC#94125151 as float) END AS retIC#94125514, CASE WHEN ((resretIC#94125152 = NA) OR (resretIC#94125152 = null)) THEN null ELSE cast(resretIC#94125152 as float) END AS resretIC#94125515, CASE WHEN ((numcos#94125153 = NA) OR (numcos#94125153 = null)) THEN null ELSE cast(numcos#94125153 as float) END AS numcos#94125528, CASE WHEN ((numdates#94125154 = NA) OR (numdates#94125154 = null)) THEN null ELSE cast(numdates#94125154 as int) END AS numdates#94125529, CASE WHEN ((annual_bmret#94125155 = NA) OR (annual_bmret#94125155 = null)) THEN null ELSE cast(annual_bmret#94125155 as float) END AS annual_bmret#94125530, CASE WHEN ((annual_ret#94125156 = NA) OR (annual_ret#94125156 = null)) THEN null ELSE cast(annual_ret#94125156 as float) END AS annual_ret#94125531, CASE WHEN ((std_ret#94125157 = NA) OR (std_ret#94125157 = null)) THEN null ELSE cast(std_ret#94125157 as float) END AS std_ret#94125543, CASE WHEN ((Sharpe_ret#94125158 = NA) OR (Sharpe_ret#94125158 = null)) THEN null ELSE cast(Sharpe_ret#94125158 as float) END AS Sharpe_ret#94125653, CASE WHEN ((PctPos_ret#94125159 = NA) OR (PctPos_ret#94125159 = null)) THEN null ELSE cast(PctPos_ret#94125159 as float) END AS PctPos_ret#94125655, CASE WHEN ((TR_ret#94125160 = NA) OR (TR_ret#94125160 = null)) THEN null ELSE cast(TR_ret#94125160 as float) END AS TR_ret#94125669, CASE WHEN ((IR_ret#94125161 = NA) OR (IR_ret#94125161 = null)) THEN null ELSE cast(IR_ret#94125161 as float) END AS IR_ret#94125673, CASE WHEN ((annual_resret#94125162 = NA) OR (annual_resret#94125162 = null)) THEN null ELSE cast(annual_resret#94125162 as float) 
END AS annual_resret#94125675, CASE WHEN ((std_resret#94125163 = NA) OR (std_resret#94125163 = null)) THEN null ELSE cast(std_resret#94125163 as float) END AS std_resret#94125677, CASE WHEN ((Sharpe_resret#94125164 = NA) OR (Sharpe_resret#94125164 = null)) THEN null ELSE cast(Sharpe_resret#94125164 as float) END AS Sharpe_resret#94125679, CASE WHEN ((PctPos_resret#94125165 = NA) OR (PctPos_resret#94125165 = null)) THEN null ELSE cast(PctPos_resret#94125165 as float) END AS PctPos_resret#94125681, CASE WHEN ((TR_resret#94125166 = NA) OR (TR_resret#94125166 = null)) THEN null ELSE cast(TR_resret#94125166 as float) END AS TR_resret#94125683, CASE WHEN ((IR_resret#94125167 = NA) OR (IR_resret#94125167 = null)) THEN null ELSE cast(IR_resret#94125167 as float) END AS IR_resret#94125685, CASE WHEN ((annual_retnet#94125168 = NA) OR (annual_retnet#94125168 = null)) THEN null ELSE cast(annual_retnet#94125168 as float) END AS annual_retnet#94125687, CASE WHEN ((std_retnet#94125169 = NA) OR (std_retnet#94125169 = null)) THEN null ELSE cast(std_retnet#94125169 as float) END AS std_retnet#94125688, CASE WHEN ((Sharpe_retnet#94125170 = NA) OR (Sharpe_retnet#94125170 = null)) THEN null ELSE cast(Sharpe_retnet#94125170 as float) END AS Sharpe_retnet#94125689, CASE WHEN ((PctPos_retnet#94125171 = NA) OR (PctPos_retnet#94125171 = null)) THEN null ELSE cast(PctPos_retnet#94125171 as float) END AS PctPos_retnet#94125690, CASE WHEN ((TR_retnet#94125172 = NA) OR (TR_retnet#94125172 = null)) THEN null ELSE cast(TR_retnet#94125172 as float) END AS TR_retnet#94125691, CASE WHEN ((IR_retnet#94125173 = NA) OR (IR_retnet#94125173 = null)) THEN null ELSE cast(IR_retnet#94125173 as float) END AS IR_retnet#94125692, ... 
2 more fields] +- FileScan csv [sector_id#94125150,retIC#94125151,resretIC#94125152,numcos#94125153,numdates#94125154,annual_bmret#94125155,annual_ret#94125156,std_ret#94125157,Sharpe_ret#94125158,PctPos_ret#94125159,TR_ret#94125160,IR_ret#94125161,annual_resret#94125162,std_resret#94125163,Sharpe_resret#94125164,PctPos_resret#94125165,TR_resret#94125166,IR_resret#94125167,annual_retnet#94125168,std_retnet#94125169,Sharpe_retnet#94125170,PctPos_retnet#94125171,TR_retnet#94125172,IR_retnet#94125173,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/transcripts/transcript..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... ,None)

(5) Scan csv
Output [26]: [sector_id#94125150, retIC#94125151, resretIC#94125152, numcos#94125153, numdates#94125154, annual_bmret#94125155, annual_ret#94125156, std_ret#94125157, Sharpe_ret#94125158, PctPos_ret#94125159, TR_ret#94125160, IR_ret#94125161, annual_resret#94125162, std_resret#94125163, Sharpe_resret#94125164, PctPos_resret#94125165, TR_resret#94125166, IR_resret#94125167, annual_retnet#94125168, std_retnet#94125169, Sharpe_retnet#94125170, PctPos_retnet#94125171, TR_retnet#94125172, IR_retnet#94125173, turnover#94125174, coverage#94125175]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/transcripts/transcript_model_residualized/stats_sector_id.csv]
ReadSchema:
struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#94125150 = NA) OR (sector_id#94125150 = null)) THEN null ELSE cast(sector_id#94125150 as int) END AS sector_id#94125513, CASE WHEN ((retIC#94125151 = NA) OR (retIC#94125151 = null)) THEN null ELSE cast(retIC#94125151 as float) END AS retIC#94125514, CASE WHEN ((resretIC#94125152 = NA) OR (resretIC#94125152 = null)) THEN null ELSE cast(resretIC#94125152 as float) END AS resretIC#94125515, CASE WHEN ((numcos#94125153 = NA) OR (numcos#94125153 = null)) THEN null ELSE cast(numcos#94125153 as float) END AS numcos#94125528, CASE WHEN ((numdates#94125154 = NA) OR (numdates#94125154 = null)) THEN null ELSE cast(numdates#94125154 as int) END AS numdates#94125529, CASE WHEN ((annual_bmret#94125155 = NA) OR (annual_bmret#94125155 = null)) THEN null ELSE cast(annual_bmret#94125155 as float) END AS annual_bmret#94125530, CASE WHEN ((annual_ret#94125156 = NA) OR (annual_ret#94125156 = null)) THEN null ELSE cast(annual_ret#94125156 as float) END AS annual_ret#94125531, CASE WHEN ((std_ret#94125157 = NA) OR (std_ret#94125157 = null)) THEN null ELSE cast(std_ret#94125157 as float) END AS std_ret#94125543, CASE WHEN ((Sharpe_ret#94125158 = NA) OR (Sharpe_ret#94125158 = null)) THEN null ELSE cast(Sharpe_ret#94125158 as float) END AS Sharpe_ret#94125653, CASE WHEN ((PctPos_ret#94125159 = NA) OR (PctPos_ret#94125159 = null)) THEN null ELSE cast(PctPos_ret#94125159 as float) END AS PctPos_ret#94125655, CASE WHEN ((TR_ret#94125160 = NA) OR (TR_ret#94125160 = null)) THEN
null ELSE cast(TR_ret#94125160 as float) END AS TR_ret#94125669, CASE WHEN ((IR_ret#94125161 = NA) OR (IR_ret#94125161 = null)) THEN null ELSE cast(IR_ret#94125161 as float) END AS IR_ret#94125673, CASE WHEN ((annual_resret#94125162 = NA) OR (annual_resret#94125162 = null)) THEN null ELSE cast(annual_resret#94125162 as float) END AS annual_resret#94125675, CASE WHEN ((std_resret#94125163 = NA) OR (std_resret#94125163 = null)) THEN null ELSE cast(std_resret#94125163 as float) END AS std_resret#94125677, CASE WHEN ((Sharpe_resret#94125164 = NA) OR (Sharpe_resret#94125164 = null)) THEN null ELSE cast(Sharpe_resret#94125164 as float) END AS Sharpe_resret#94125679, CASE WHEN ((PctPos_resret#94125165 = NA) OR (PctPos_resret#94125165 = null)) THEN null ELSE cast(PctPos_resret#94125165 as float) END AS PctPos_resret#94125681, CASE WHEN ((TR_resret#94125166 = NA) OR (TR_resret#94125166 = null)) THEN null ELSE cast(TR_resret#94125166 as float) END AS TR_resret#94125683, CASE WHEN ((IR_resret#94125167 = NA) OR (IR_resret#94125167 = null)) THEN null ELSE cast(IR_resret#94125167 as float) END AS IR_resret#94125685, CASE WHEN ((annual_retnet#94125168 = NA) OR (annual_retnet#94125168 = null)) THEN null ELSE cast(annual_retnet#94125168 as float) END AS annual_retnet#94125687, CASE WHEN ((std_retnet#94125169 = NA) OR (std_retnet#94125169 = null)) THEN null ELSE cast(std_retnet#94125169 as float) END AS std_retnet#94125688, CASE WHEN ((Sharpe_retnet#94125170 = NA) OR (Sharpe_retnet#94125170 = null)) THEN null ELSE cast(Sharpe_retnet#94125170 as float) END AS Sharpe_retnet#94125689, CASE WHEN ((PctPos_retnet#94125171 = NA) OR (PctPos_retnet#94125171 = null)) THEN null ELSE cast(PctPos_retnet#94125171 as float) END AS PctPos_retnet#94125690, CASE WHEN ((TR_retnet#94125172 = NA) OR (TR_retnet#94125172 = null)) THEN null ELSE cast(TR_retnet#94125172 as float) END AS TR_retnet#94125691, CASE WHEN ((IR_retnet#94125173 = NA) OR (IR_retnet#94125173 = null)) THEN null ELSE 
cast(IR_retnet#94125173 as float) END AS IR_retnet#94125692, CASE WHEN ((turnover#94125174 = NA) OR (turnover#94125174 = null)) THEN null ELSE cast(turnover#94125174 as float) END AS turnover#94125693, CASE WHEN ((coverage#94125175 = NA) OR (coverage#94125175 = null)) THEN null ELSE cast(coverage#94125175 as float) END AS coverage#94125694]
Input [26]: [sector_id#94125150, retIC#94125151, resretIC#94125152, numcos#94125153, numdates#94125154, annual_bmret#94125155, annual_ret#94125156, std_ret#94125157, Sharpe_ret#94125158, PctPos_ret#94125159, TR_ret#94125160, IR_ret#94125161, annual_resret#94125162, std_resret#94125163, Sharpe_resret#94125164, PctPos_resret#94125165, TR_resret#94125166, IR_resret#94125167, annual_retnet#94125168, std_retnet#94125169, Sharpe_retnet#94125170, PctPos_retnet#94125171, TR_retnet#94125172, IR_retnet#94125173, turnover#94125174, coverage#94125175]

(7) ColumnarToRow [codegen id : 2]
Input [4]: [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513]

(8) Filter [codegen id : 2]
Input [4]: [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513]
Condition : isnotnull(sector_id#94125513)

(9) Project [codegen id : 2]
Output [5]: [sector_id#94125513, numcos#94125528, numdates#94125529, coverage#94125694, round((cast(numcos#94125528 as double) / cast(coverage#94125694 as double)), 0) AS universe#94125754]
Input [4]: [coverage#94125694, numcos#94125528, numdates#94125529, sector_id#94125513]

(10) InMemoryTableScan
Output [3]: [sector_id#93880529, sort#93880530, description#93880532]
Arguments: [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)]

(11) InMemoryRelation
Arguments: [sector_id#93880529, sort#93880530, description#93880532, universe#93880534], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR
(sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None)

(12) Scan csv
Output [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(13) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534]
Input [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]

(14) Filter [codegen id : 1]
Input [3]: [sector_id#93880529, sort#93880530, description#93880532]
Condition : isnotnull(sector_id#93880529)

(15) BroadcastExchange
Input [3]: [sector_id#93880529, sort#93880530, description#93880532]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int,
false] as bigint)),false), [id=#7515151]

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#94125513]
Right keys [1]: [sector_id#93880529]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]
Input [8]: [sector_id#94125513, numcos#94125528, numdates#94125529, coverage#94125694, universe#94125754, sector_id#93880529, sort#93880530, description#93880532]

(18) Exchange
Input [7]: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]
Arguments: rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7515156]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]
Arguments: [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#94125513, numcos#94125528, numdates#94125529, sort#93880530, description#93880532, universe#94125754, coverage#94125694]
Arguments: 1000000