== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
   +- InMemoryRelation (2)
      +- * Sort (19)
         +- Exchange (18)
            +- * Project (17)
               +- * BroadcastHashJoin Inner BuildRight (16)
                  :- * Project (9)
                  :  +- * Filter (8)
                  :     +- * ColumnarToRow (7)
                  :        +- InMemoryTableScan (3)
                  :           +- InMemoryRelation (4)
                  :              +- * Project (6)
                  :                 +- Scan csv (5)
                  +- BroadcastExchange (15)
                     +- * Filter (14)
                        +- InMemoryTableScan (10)
                           +- InMemoryRelation (11)
                              +- * Project (13)
                                 +- Scan csv (12)


(1) InMemoryTableScan
Output [7]: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]
Arguments: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]

(2) InMemoryRelation
Arguments: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0 +- Exchange rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7513996] +- *(2) Project [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678] +- *(2) BroadcastHashJoin [sector_id#94110582], [sector_id#93880529], Inner, BuildRight, false :- *(2) Project [sector_id#94110582, numcos#94110587, numdates#94110588, coverage#94110678, round((cast(numcos#94110587 as double) / cast(coverage#94110678 as double)), 0) AS universe#94110747] : +- *(2) Filter isnotnull(sector_id#94110582) : +- *(2) ColumnarToRow : +- InMemoryTableScan [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582], [isnotnull(sector_id#94110582)] : +- InMemoryRelation [sector_id#94110582,
retIC#94110583, resretIC#94110586, numcos#94110587, numdates#94110588, annual_bmret#94110589, annual_ret#94110616, std_ret#94110617, Sharpe_ret#94110618, PctPos_ret#94110645, TR_ret#94110646, IR_ret#94110647, annual_resret#94110649, std_resret#94110651, Sharpe_resret#94110654, PctPos_resret#94110656, TR_resret#94110658, IR_resret#94110660, annual_retnet#94110662, std_retnet#94110665, Sharpe_retnet#94110667, PctPos_retnet#94110669, TR_retnet#94110672, IR_retnet#94110674, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#94110097 = NA) OR (sector_id#94110097 = null)) THEN null ELSE cast(sector_id#94110097 as int) END AS sector_id#94110582, CASE WHEN ((retIC#94110098 = NA) OR (retIC#94110098 = null)) THEN null ELSE cast(retIC#94110098 as float) END AS retIC#94110583, CASE WHEN ((resretIC#94110099 = NA) OR (resretIC#94110099 = null)) THEN null ELSE cast(resretIC#94110099 as float) END AS resretIC#94110586, CASE WHEN ((numcos#94110100 = NA) OR (numcos#94110100 = null)) THEN null ELSE cast(numcos#94110100 as float) END AS numcos#94110587, CASE WHEN ((numdates#94110101 = NA) OR (numdates#94110101 = null)) THEN null ELSE cast(numdates#94110101 as int) END AS numdates#94110588, CASE WHEN ((annual_bmret#94110102 = NA) OR (annual_bmret#94110102 = null)) THEN null ELSE cast(annual_bmret#94110102 as float) END AS annual_bmret#94110589, CASE WHEN ((annual_ret#94110103 = NA) OR (annual_ret#94110103 = null)) THEN null ELSE cast(annual_ret#94110103 as float) END AS annual_ret#94110616, CASE WHEN ((std_ret#94110104 = NA) OR (std_ret#94110104 = null)) THEN null ELSE cast(std_ret#94110104 as float) END AS std_ret#94110617, CASE WHEN ((Sharpe_ret#94110105 = NA) OR (Sharpe_ret#94110105 = null)) THEN null ELSE cast(Sharpe_ret#94110105 as float) END AS Sharpe_ret#94110618, CASE WHEN ((PctPos_ret#94110106 = NA) OR (PctPos_ret#94110106 = null)) THEN null ELSE cast(PctPos_ret#94110106 as float) END AS PctPos_ret#94110645, CASE 
WHEN ((TR_ret#94110107 = NA) OR (TR_ret#94110107 = null)) THEN null ELSE cast(TR_ret#94110107 as float) END AS TR_ret#94110646, CASE WHEN ((IR_ret#94110108 = NA) OR (IR_ret#94110108 = null)) THEN null ELSE cast(IR_ret#94110108 as float) END AS IR_ret#94110647, CASE WHEN ((annual_resret#94110109 = NA) OR (annual_resret#94110109 = null)) THEN null ELSE cast(annual_resret#94110109 as float) END AS annual_resret#94110649, CASE WHEN ((std_resret#94110110 = NA) OR (std_resret#94110110 = null)) THEN null ELSE cast(std_resret#94110110 as float) END AS std_resret#94110651, CASE WHEN ((Sharpe_resret#94110111 = NA) OR (Sharpe_resret#94110111 = null)) THEN null ELSE cast(Sharpe_resret#94110111 as float) END AS Sharpe_resret#94110654, CASE WHEN ((PctPos_resret#94110112 = NA) OR (PctPos_resret#94110112 = null)) THEN null ELSE cast(PctPos_resret#94110112 as float) END AS PctPos_resret#94110656, CASE WHEN ((TR_resret#94110113 = NA) OR (TR_resret#94110113 = null)) THEN null ELSE cast(TR_resret#94110113 as float) END AS TR_resret#94110658, CASE WHEN ((IR_resret#94110114 = NA) OR (IR_resret#94110114 = null)) THEN null ELSE cast(IR_resret#94110114 as float) END AS IR_resret#94110660, CASE WHEN ((annual_retnet#94110115 = NA) OR (annual_retnet#94110115 = null)) THEN null ELSE cast(annual_retnet#94110115 as float) END AS annual_retnet#94110662, CASE WHEN ((std_retnet#94110116 = NA) OR (std_retnet#94110116 = null)) THEN null ELSE cast(std_retnet#94110116 as float) END AS std_retnet#94110665, CASE WHEN ((Sharpe_retnet#94110117 = NA) OR (Sharpe_retnet#94110117 = null)) THEN null ELSE cast(Sharpe_retnet#94110117 as float) END AS Sharpe_retnet#94110667, CASE WHEN ((PctPos_retnet#94110118 = NA) OR (PctPos_retnet#94110118 = null)) THEN null ELSE cast(PctPos_retnet#94110118 as float) END AS PctPos_retnet#94110669, CASE WHEN ((TR_retnet#94110119 = NA) OR (TR_retnet#94110119 = null)) THEN null ELSE cast(TR_retnet#94110119 as float) END AS TR_retnet#94110672, CASE WHEN ((IR_retnet#94110120 = NA) OR 
(IR_retnet#94110120 = null)) THEN null ELSE cast(IR_retnet#94110120 as float) END AS IR_retnet#94110674, ... 2 more fields] : +- FileScan csv [sector_id#94110097,retIC#94110098,resretIC#94110099,numcos#94110100,numdates#94110101,annual_bmret#94110102,annual_ret#94110103,std_ret#94110104,Sharpe_ret#94110105,PctPos_ret#94110106,TR_ret#94110107,IR_ret#94110108,annual_resret#94110109,std_resret#94110110,Sharpe_resret#94110111,PctPos_resret#94110112,TR_resret#94110113,IR_resret#94110114,annual_retnet#94110115,std_retnet#94110116,Sharpe_retnet#94110117,PctPos_retnet#94110118,TR_retnet#94110119,IR_retnet#94110120,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/tm1/eatm1_score/stats_..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7513991] +- *(1) Filter isnotnull(sector_id#93880529) +- InMemoryTableScan [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)] +- InMemoryRelation [sector_id#93880529, sort#93880530, description#93880532, universe#93880534], StorageLevel(disk, memory, deserialized, 1 replicas) +- *(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, 
Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None), [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582]
Arguments: [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582], [isnotnull(sector_id#94110582)]

(4) InMemoryRelation
Arguments: [sector_id#94110582, retIC#94110583, resretIC#94110586, numcos#94110587, numdates#94110588, annual_bmret#94110589, annual_ret#94110616, std_ret#94110617, Sharpe_ret#94110618, PctPos_ret#94110645, TR_ret#94110646, IR_ret#94110647, annual_resret#94110649, std_resret#94110651, Sharpe_resret#94110654, PctPos_resret#94110656, TR_resret#94110658, IR_resret#94110660, annual_retnet#94110662, std_retnet#94110665, Sharpe_retnet#94110667, PctPos_retnet#94110669, TR_retnet#94110672, IR_retnet#94110674, ...
2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94110097 = NA) OR (sector_id#94110097 = null)) THEN null ELSE cast(sector_id#94110097 as int) END AS sector_id#94110582, CASE WHEN ((retIC#94110098 = NA) OR (retIC#94110098 = null)) THEN null ELSE cast(retIC#94110098 as float) END AS retIC#94110583, CASE WHEN ((resretIC#94110099 = NA) OR (resretIC#94110099 = null)) THEN null ELSE cast(resretIC#94110099 as float) END AS resretIC#94110586, CASE WHEN ((numcos#94110100 = NA) OR (numcos#94110100 = null)) THEN null ELSE cast(numcos#94110100 as float) END AS numcos#94110587, CASE WHEN ((numdates#94110101 = NA) OR (numdates#94110101 = null)) THEN null ELSE cast(numdates#94110101 as int) END AS numdates#94110588, CASE WHEN ((annual_bmret#94110102 = NA) OR (annual_bmret#94110102 = null)) THEN null ELSE cast(annual_bmret#94110102 as float) END AS annual_bmret#94110589, CASE WHEN ((annual_ret#94110103 = NA) OR (annual_ret#94110103 = null)) THEN null ELSE cast(annual_ret#94110103 as float) END AS annual_ret#94110616, CASE WHEN ((std_ret#94110104 = NA) OR (std_ret#94110104 = null)) THEN null ELSE cast(std_ret#94110104 as float) END AS std_ret#94110617, CASE WHEN ((Sharpe_ret#94110105 = NA) OR (Sharpe_ret#94110105 = null)) THEN null ELSE cast(Sharpe_ret#94110105 as float) END AS Sharpe_ret#94110618, CASE WHEN ((PctPos_ret#94110106 = NA) OR (PctPos_ret#94110106 = null)) THEN null ELSE cast(PctPos_ret#94110106 as float) END AS PctPos_ret#94110645, CASE WHEN ((TR_ret#94110107 = NA) OR (TR_ret#94110107 = null)) THEN null ELSE cast(TR_ret#94110107 as float) END AS TR_ret#94110646, CASE WHEN ((IR_ret#94110108 = NA) OR (IR_ret#94110108 = null)) THEN null ELSE cast(IR_ret#94110108 as float) END AS IR_ret#94110647, CASE WHEN ((annual_resret#94110109 = NA) OR (annual_resret#94110109 = null)) THEN null ELSE cast(annual_resret#94110109 as float) 
END AS annual_resret#94110649, CASE WHEN ((std_resret#94110110 = NA) OR (std_resret#94110110 = null)) THEN null ELSE cast(std_resret#94110110 as float) END AS std_resret#94110651, CASE WHEN ((Sharpe_resret#94110111 = NA) OR (Sharpe_resret#94110111 = null)) THEN null ELSE cast(Sharpe_resret#94110111 as float) END AS Sharpe_resret#94110654, CASE WHEN ((PctPos_resret#94110112 = NA) OR (PctPos_resret#94110112 = null)) THEN null ELSE cast(PctPos_resret#94110112 as float) END AS PctPos_resret#94110656, CASE WHEN ((TR_resret#94110113 = NA) OR (TR_resret#94110113 = null)) THEN null ELSE cast(TR_resret#94110113 as float) END AS TR_resret#94110658, CASE WHEN ((IR_resret#94110114 = NA) OR (IR_resret#94110114 = null)) THEN null ELSE cast(IR_resret#94110114 as float) END AS IR_resret#94110660, CASE WHEN ((annual_retnet#94110115 = NA) OR (annual_retnet#94110115 = null)) THEN null ELSE cast(annual_retnet#94110115 as float) END AS annual_retnet#94110662, CASE WHEN ((std_retnet#94110116 = NA) OR (std_retnet#94110116 = null)) THEN null ELSE cast(std_retnet#94110116 as float) END AS std_retnet#94110665, CASE WHEN ((Sharpe_retnet#94110117 = NA) OR (Sharpe_retnet#94110117 = null)) THEN null ELSE cast(Sharpe_retnet#94110117 as float) END AS Sharpe_retnet#94110667, CASE WHEN ((PctPos_retnet#94110118 = NA) OR (PctPos_retnet#94110118 = null)) THEN null ELSE cast(PctPos_retnet#94110118 as float) END AS PctPos_retnet#94110669, CASE WHEN ((TR_retnet#94110119 = NA) OR (TR_retnet#94110119 = null)) THEN null ELSE cast(TR_retnet#94110119 as float) END AS TR_retnet#94110672, CASE WHEN ((IR_retnet#94110120 = NA) OR (IR_retnet#94110120 = null)) THEN null ELSE cast(IR_retnet#94110120 as float) END AS IR_retnet#94110674, ... 
2 more fields] +- FileScan csv [sector_id#94110097,retIC#94110098,resretIC#94110099,numcos#94110100,numdates#94110101,annual_bmret#94110102,annual_ret#94110103,std_ret#94110104,Sharpe_ret#94110105,PctPos_ret#94110106,TR_ret#94110107,IR_ret#94110108,annual_resret#94110109,std_resret#94110110,Sharpe_resret#94110111,PctPos_resret#94110112,TR_resret#94110113,IR_resret#94110114,annual_retnet#94110115,std_retnet#94110116,Sharpe_retnet#94110117,PctPos_retnet#94110118,TR_retnet#94110119,IR_retnet#94110120,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/tm1/eatm1_score/stats_..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s... ,None)

(5) Scan csv
Output [26]: [sector_id#94110097, retIC#94110098, resretIC#94110099, numcos#94110100, numdates#94110101, annual_bmret#94110102, annual_ret#94110103, std_ret#94110104, Sharpe_ret#94110105, PctPos_ret#94110106, TR_ret#94110107, IR_ret#94110108, annual_resret#94110109, std_resret#94110110, Sharpe_resret#94110111, PctPos_resret#94110112, TR_resret#94110113, IR_resret#94110114, annual_retnet#94110115, std_retnet#94110116, Sharpe_retnet#94110117, PctPos_retnet#94110118, TR_retnet#94110119, IR_retnet#94110120, turnover#94110121, coverage#94110122]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/tm1/eatm1_score/stats_sector_id.csv]
ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#94110097 = NA) OR (sector_id#94110097 = null)) THEN null ELSE cast(sector_id#94110097 as int) END AS sector_id#94110582, CASE WHEN ((retIC#94110098 = NA) OR (retIC#94110098 = null)) THEN null ELSE cast(retIC#94110098 as float) END AS retIC#94110583, CASE WHEN ((resretIC#94110099 = NA) OR (resretIC#94110099 = null)) THEN null ELSE cast(resretIC#94110099 as float) END AS resretIC#94110586, CASE WHEN ((numcos#94110100 = NA) OR (numcos#94110100 = null)) THEN null ELSE cast(numcos#94110100 as float) END AS numcos#94110587, CASE WHEN ((numdates#94110101 = NA) OR (numdates#94110101 = null)) THEN null ELSE cast(numdates#94110101 as int) END AS numdates#94110588, CASE WHEN ((annual_bmret#94110102 = NA) OR (annual_bmret#94110102 = null)) THEN null ELSE cast(annual_bmret#94110102 as float) END AS annual_bmret#94110589, CASE WHEN ((annual_ret#94110103 = NA) OR (annual_ret#94110103 = null)) THEN null ELSE cast(annual_ret#94110103 as float) END AS annual_ret#94110616, CASE WHEN ((std_ret#94110104 = NA) OR (std_ret#94110104 = null)) THEN null ELSE cast(std_ret#94110104 as float) END AS std_ret#94110617, CASE WHEN ((Sharpe_ret#94110105 = NA) OR (Sharpe_ret#94110105 = null)) THEN null ELSE cast(Sharpe_ret#94110105 as float) END AS Sharpe_ret#94110618, CASE WHEN ((PctPos_ret#94110106 = NA) OR (PctPos_ret#94110106 = null)) THEN null ELSE cast(PctPos_ret#94110106 as float) END AS PctPos_ret#94110645, CASE WHEN ((TR_ret#94110107 = NA) OR (TR_ret#94110107 = null)) THEN
null ELSE cast(TR_ret#94110107 as float) END AS TR_ret#94110646, CASE WHEN ((IR_ret#94110108 = NA) OR (IR_ret#94110108 = null)) THEN null ELSE cast(IR_ret#94110108 as float) END AS IR_ret#94110647, CASE WHEN ((annual_resret#94110109 = NA) OR (annual_resret#94110109 = null)) THEN null ELSE cast(annual_resret#94110109 as float) END AS annual_resret#94110649, CASE WHEN ((std_resret#94110110 = NA) OR (std_resret#94110110 = null)) THEN null ELSE cast(std_resret#94110110 as float) END AS std_resret#94110651, CASE WHEN ((Sharpe_resret#94110111 = NA) OR (Sharpe_resret#94110111 = null)) THEN null ELSE cast(Sharpe_resret#94110111 as float) END AS Sharpe_resret#94110654, CASE WHEN ((PctPos_resret#94110112 = NA) OR (PctPos_resret#94110112 = null)) THEN null ELSE cast(PctPos_resret#94110112 as float) END AS PctPos_resret#94110656, CASE WHEN ((TR_resret#94110113 = NA) OR (TR_resret#94110113 = null)) THEN null ELSE cast(TR_resret#94110113 as float) END AS TR_resret#94110658, CASE WHEN ((IR_resret#94110114 = NA) OR (IR_resret#94110114 = null)) THEN null ELSE cast(IR_resret#94110114 as float) END AS IR_resret#94110660, CASE WHEN ((annual_retnet#94110115 = NA) OR (annual_retnet#94110115 = null)) THEN null ELSE cast(annual_retnet#94110115 as float) END AS annual_retnet#94110662, CASE WHEN ((std_retnet#94110116 = NA) OR (std_retnet#94110116 = null)) THEN null ELSE cast(std_retnet#94110116 as float) END AS std_retnet#94110665, CASE WHEN ((Sharpe_retnet#94110117 = NA) OR (Sharpe_retnet#94110117 = null)) THEN null ELSE cast(Sharpe_retnet#94110117 as float) END AS Sharpe_retnet#94110667, CASE WHEN ((PctPos_retnet#94110118 = NA) OR (PctPos_retnet#94110118 = null)) THEN null ELSE cast(PctPos_retnet#94110118 as float) END AS PctPos_retnet#94110669, CASE WHEN ((TR_retnet#94110119 = NA) OR (TR_retnet#94110119 = null)) THEN null ELSE cast(TR_retnet#94110119 as float) END AS TR_retnet#94110672, CASE WHEN ((IR_retnet#94110120 = NA) OR (IR_retnet#94110120 = null)) THEN null ELSE 
cast(IR_retnet#94110120 as float) END AS IR_retnet#94110674, CASE WHEN ((turnover#94110121 = NA) OR (turnover#94110121 = null)) THEN null ELSE cast(turnover#94110121 as float) END AS turnover#94110676, CASE WHEN ((coverage#94110122 = NA) OR (coverage#94110122 = null)) THEN null ELSE cast(coverage#94110122 as float) END AS coverage#94110678]
Input [26]: [sector_id#94110097, retIC#94110098, resretIC#94110099, numcos#94110100, numdates#94110101, annual_bmret#94110102, annual_ret#94110103, std_ret#94110104, Sharpe_ret#94110105, PctPos_ret#94110106, TR_ret#94110107, IR_ret#94110108, annual_resret#94110109, std_resret#94110110, Sharpe_resret#94110111, PctPos_resret#94110112, TR_resret#94110113, IR_resret#94110114, annual_retnet#94110115, std_retnet#94110116, Sharpe_retnet#94110117, PctPos_retnet#94110118, TR_retnet#94110119, IR_retnet#94110120, turnover#94110121, coverage#94110122]

(7) ColumnarToRow [codegen id : 2]
Input [4]: [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582]

(8) Filter [codegen id : 2]
Input [4]: [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582]
Condition : isnotnull(sector_id#94110582)

(9) Project [codegen id : 2]
Output [5]: [sector_id#94110582, numcos#94110587, numdates#94110588, coverage#94110678, round((cast(numcos#94110587 as double) / cast(coverage#94110678 as double)), 0) AS universe#94110747]
Input [4]: [coverage#94110678, numcos#94110587, numdates#94110588, sector_id#94110582]

(10) InMemoryTableScan
Output [3]: [sector_id#93880529, sort#93880530, description#93880532]
Arguments: [sector_id#93880529, sort#93880530, description#93880532], [isnotnull(sector_id#93880529)]

(11) InMemoryRelation
Arguments: [sector_id#93880529, sort#93880530, description#93880532, universe#93880534], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534] +- FileScan csv [sector_id#93880497,sort#93880499,description#93880501,universe#93880503] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string> ,None)

(12) Scan csv
Output [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(13) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#93880497 = NA) OR (sector_id#93880497 = null)) THEN null ELSE cast(sector_id#93880497 as int) END AS sector_id#93880529, CASE WHEN (sort#93880499 = null) THEN null ELSE sort#93880499 END AS sort#93880530, CASE WHEN (description#93880501 = null) THEN null ELSE description#93880501 END AS description#93880532, CASE WHEN ((universe#93880503 = NA) OR (universe#93880503 = null)) THEN null ELSE cast(universe#93880503 as int) END AS universe#93880534]
Input [4]: [sector_id#93880497, sort#93880499, description#93880501, universe#93880503]

(14) Filter [codegen id : 1]
Input [3]: [sector_id#93880529, sort#93880530, description#93880532]
Condition : isnotnull(sector_id#93880529)

(15) BroadcastExchange
Input [3]: [sector_id#93880529, sort#93880530, description#93880532]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7513991]

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#94110582]
Right keys [1]: [sector_id#93880529]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]
Input [8]: [sector_id#94110582, numcos#94110587, numdates#94110588, coverage#94110678, universe#94110747, sector_id#93880529, sort#93880530, description#93880532]

(18) Exchange
Input [7]: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]
Arguments: rangepartitioning(sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7513996]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]
Arguments: [sort#93880530 ASC NULLS FIRST, description#93880532 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#94110582, numcos#94110587, numdates#94110588, sort#93880530, description#93880532, universe#94110747, coverage#94110678]
Arguments: 1000000