== Physical Plan ==
CollectLimit (20)
+- InMemoryTableScan (1)
      +- InMemoryRelation (2)
            +- * Sort (19)
               +- Exchange (18)
                  +- * Project (17)
                     +- * BroadcastHashJoin Inner BuildRight (16)
                        :- * Project (9)
                        :  +- * Filter (8)
                        :     +- * ColumnarToRow (7)
                        :        +- InMemoryTableScan (3)
                        :              +- InMemoryRelation (4)
                        :                    +- * Project (6)
                        :                       +- Scan csv (5)
                        +- BroadcastExchange (15)
                           +- * Filter (14)
                              +- InMemoryTableScan (10)
                                    +- InMemoryRelation (11)
                                          +- * Project (13)
                                             +- Scan csv (12)


(1) InMemoryTableScan
Output [7]: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
Arguments: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]

(2) InMemoryRelation
Arguments: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(3) Sort [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0
+- Exchange rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7534490]
   +- *(2) Project [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
      +- *(2) BroadcastHashJoin [sector_id#94368013], [sector_id#94160418], Inner, BuildRight, false
         :- *(2) Project [sector_id#94368013, numcos#94368019, numdates#94368101, coverage#94368142, round((cast(numcos#94368019 as double) / cast(coverage#94368142 as double)), 0) AS universe#94368286]
         :  +- *(2) Filter isnotnull(sector_id#94368013)
         :     +- *(2) ColumnarToRow
         :        +- InMemoryTableScan [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013], [isnotnull(sector_id#94368013)]
         :              +- InMemoryRelation [sector_id#94368013, 
retIC#94368015, resretIC#94368017, numcos#94368019, numdates#94368101, annual_bmret#94368107, annual_ret#94368108, std_ret#94368109, Sharpe_ret#94368110, PctPos_ret#94368111, TR_ret#94368112, IR_ret#94368113, annual_resret#94368114, std_resret#94368115, Sharpe_resret#94368116, PctPos_resret#94368117, TR_resret#94368118, IR_resret#94368119, annual_retnet#94368120, std_retnet#94368121, Sharpe_retnet#94368131, PctPos_retnet#94368135, TR_retnet#94368137, IR_retnet#94368139, ... 2 more fields], StorageLevel(disk, memory, deserialized, 1 replicas) : +- *(1) Project [CASE WHEN ((sector_id#94367759 = NA) OR (sector_id#94367759 = null)) THEN null ELSE cast(sector_id#94367759 as int) END AS sector_id#94368013, CASE WHEN ((retIC#94367760 = NA) OR (retIC#94367760 = null)) THEN null ELSE cast(retIC#94367760 as float) END AS retIC#94368015, CASE WHEN ((resretIC#94367761 = NA) OR (resretIC#94367761 = null)) THEN null ELSE cast(resretIC#94367761 as float) END AS resretIC#94368017, CASE WHEN ((numcos#94367762 = NA) OR (numcos#94367762 = null)) THEN null ELSE cast(numcos#94367762 as float) END AS numcos#94368019, CASE WHEN ((numdates#94367763 = NA) OR (numdates#94367763 = null)) THEN null ELSE cast(numdates#94367763 as int) END AS numdates#94368101, CASE WHEN ((annual_bmret#94367764 = NA) OR (annual_bmret#94367764 = null)) THEN null ELSE cast(annual_bmret#94367764 as float) END AS annual_bmret#94368107, CASE WHEN ((annual_ret#94367765 = NA) OR (annual_ret#94367765 = null)) THEN null ELSE cast(annual_ret#94367765 as float) END AS annual_ret#94368108, CASE WHEN ((std_ret#94367766 = NA) OR (std_ret#94367766 = null)) THEN null ELSE cast(std_ret#94367766 as float) END AS std_ret#94368109, CASE WHEN ((Sharpe_ret#94367767 = NA) OR (Sharpe_ret#94367767 = null)) THEN null ELSE cast(Sharpe_ret#94367767 as float) END AS Sharpe_ret#94368110, CASE WHEN ((PctPos_ret#94367768 = NA) OR (PctPos_ret#94367768 = null)) THEN null ELSE cast(PctPos_ret#94367768 as float) END AS PctPos_ret#94368111, CASE 
WHEN ((TR_ret#94367769 = NA) OR (TR_ret#94367769 = null)) THEN null ELSE cast(TR_ret#94367769 as float) END AS TR_ret#94368112, CASE WHEN ((IR_ret#94367770 = NA) OR (IR_ret#94367770 = null)) THEN null ELSE cast(IR_ret#94367770 as float) END AS IR_ret#94368113, CASE WHEN ((annual_resret#94367771 = NA) OR (annual_resret#94367771 = null)) THEN null ELSE cast(annual_resret#94367771 as float) END AS annual_resret#94368114, CASE WHEN ((std_resret#94367772 = NA) OR (std_resret#94367772 = null)) THEN null ELSE cast(std_resret#94367772 as float) END AS std_resret#94368115, CASE WHEN ((Sharpe_resret#94367773 = NA) OR (Sharpe_resret#94367773 = null)) THEN null ELSE cast(Sharpe_resret#94367773 as float) END AS Sharpe_resret#94368116, CASE WHEN ((PctPos_resret#94367774 = NA) OR (PctPos_resret#94367774 = null)) THEN null ELSE cast(PctPos_resret#94367774 as float) END AS PctPos_resret#94368117, CASE WHEN ((TR_resret#94367775 = NA) OR (TR_resret#94367775 = null)) THEN null ELSE cast(TR_resret#94367775 as float) END AS TR_resret#94368118, CASE WHEN ((IR_resret#94367776 = NA) OR (IR_resret#94367776 = null)) THEN null ELSE cast(IR_resret#94367776 as float) END AS IR_resret#94368119, CASE WHEN ((annual_retnet#94367777 = NA) OR (annual_retnet#94367777 = null)) THEN null ELSE cast(annual_retnet#94367777 as float) END AS annual_retnet#94368120, CASE WHEN ((std_retnet#94367778 = NA) OR (std_retnet#94367778 = null)) THEN null ELSE cast(std_retnet#94367778 as float) END AS std_retnet#94368121, CASE WHEN ((Sharpe_retnet#94367779 = NA) OR (Sharpe_retnet#94367779 = null)) THEN null ELSE cast(Sharpe_retnet#94367779 as float) END AS Sharpe_retnet#94368131, CASE WHEN ((PctPos_retnet#94367780 = NA) OR (PctPos_retnet#94367780 = null)) THEN null ELSE cast(PctPos_retnet#94367780 as float) END AS PctPos_retnet#94368135, CASE WHEN ((TR_retnet#94367781 = NA) OR (TR_retnet#94367781 = null)) THEN null ELSE cast(TR_retnet#94367781 as float) END AS TR_retnet#94368137, CASE WHEN ((IR_retnet#94367782 = NA) OR 
(IR_retnet#94367782 = null)) THEN null ELSE cast(IR_retnet#94367782 as float) END AS IR_retnet#94368139, ... 2 more fields]
         :                       +- FileScan csv [sector_id#94367759,retIC#94367760,resretIC#94367761,numcos#94367762,numdates#94367763,annual_bmret#94367764,annual_ret#94367765,std_ret#94367766,Sharpe_ret#94367767,PctPos_ret#94367768,TR_ret#94367769,IR_ret#94367770,annual_resret#94367771,std_resret#94367772,Sharpe_resret#94367773,PctPos_resret#94367774,TR_resret#94367775,IR_resret#94367776,annual_retnet#94367777,std_retnet#94367778,Sharpe_retnet#94367779,PctPos_retnet#94367780,TR_retnet#94367781,IR_retnet#94367782,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/growth/st..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s...
         +- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7534485]
            +- *(1) Filter isnotnull(sector_id#94160418)
               +- InMemoryTableScan [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)]
                     +- InMemoryRelation [sector_id#94160418, sort#94160419, description#94160423, universe#94160424], StorageLevel(disk, memory, deserialized, 1 replicas)
                           +- *(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
                              +- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>
,None), [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST]

(3) InMemoryTableScan
Output [4]: [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013]
Arguments: [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013], [isnotnull(sector_id#94368013)]

(4) InMemoryRelation
Arguments: [sector_id#94368013, retIC#94368015, resretIC#94368017, numcos#94368019, numdates#94368101, annual_bmret#94368107, annual_ret#94368108, std_ret#94368109, Sharpe_ret#94368110, PctPos_ret#94368111, TR_ret#94368112, IR_ret#94368113, annual_resret#94368114, std_resret#94368115, Sharpe_resret#94368116, PctPos_resret#94368117, TR_resret#94368118, IR_resret#94368119, annual_retnet#94368120, std_retnet#94368121, Sharpe_retnet#94368131, PctPos_retnet#94368135, TR_retnet#94368137, IR_retnet#94368139, ... 
2 more fields], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94367759 = NA) OR (sector_id#94367759 = null)) THEN null ELSE cast(sector_id#94367759 as int) END AS sector_id#94368013, CASE WHEN ((retIC#94367760 = NA) OR (retIC#94367760 = null)) THEN null ELSE cast(retIC#94367760 as float) END AS retIC#94368015, CASE WHEN ((resretIC#94367761 = NA) OR (resretIC#94367761 = null)) THEN null ELSE cast(resretIC#94367761 as float) END AS resretIC#94368017, CASE WHEN ((numcos#94367762 = NA) OR (numcos#94367762 = null)) THEN null ELSE cast(numcos#94367762 as float) END AS numcos#94368019, CASE WHEN ((numdates#94367763 = NA) OR (numdates#94367763 = null)) THEN null ELSE cast(numdates#94367763 as int) END AS numdates#94368101, CASE WHEN ((annual_bmret#94367764 = NA) OR (annual_bmret#94367764 = null)) THEN null ELSE cast(annual_bmret#94367764 as float) END AS annual_bmret#94368107, CASE WHEN ((annual_ret#94367765 = NA) OR (annual_ret#94367765 = null)) THEN null ELSE cast(annual_ret#94367765 as float) END AS annual_ret#94368108, CASE WHEN ((std_ret#94367766 = NA) OR (std_ret#94367766 = null)) THEN null ELSE cast(std_ret#94367766 as float) END AS std_ret#94368109, CASE WHEN ((Sharpe_ret#94367767 = NA) OR (Sharpe_ret#94367767 = null)) THEN null ELSE cast(Sharpe_ret#94367767 as float) END AS Sharpe_ret#94368110, CASE WHEN ((PctPos_ret#94367768 = NA) OR (PctPos_ret#94367768 = null)) THEN null ELSE cast(PctPos_ret#94367768 as float) END AS PctPos_ret#94368111, CASE WHEN ((TR_ret#94367769 = NA) OR (TR_ret#94367769 = null)) THEN null ELSE cast(TR_ret#94367769 as float) END AS TR_ret#94368112, CASE WHEN ((IR_ret#94367770 = NA) OR (IR_ret#94367770 = null)) THEN null ELSE cast(IR_ret#94367770 as float) END AS IR_ret#94368113, CASE WHEN ((annual_resret#94367771 = NA) OR (annual_resret#94367771 = null)) THEN null ELSE cast(annual_resret#94367771 as float) 
END AS annual_resret#94368114, CASE WHEN ((std_resret#94367772 = NA) OR (std_resret#94367772 = null)) THEN null ELSE cast(std_resret#94367772 as float) END AS std_resret#94368115, CASE WHEN ((Sharpe_resret#94367773 = NA) OR (Sharpe_resret#94367773 = null)) THEN null ELSE cast(Sharpe_resret#94367773 as float) END AS Sharpe_resret#94368116, CASE WHEN ((PctPos_resret#94367774 = NA) OR (PctPos_resret#94367774 = null)) THEN null ELSE cast(PctPos_resret#94367774 as float) END AS PctPos_resret#94368117, CASE WHEN ((TR_resret#94367775 = NA) OR (TR_resret#94367775 = null)) THEN null ELSE cast(TR_resret#94367775 as float) END AS TR_resret#94368118, CASE WHEN ((IR_resret#94367776 = NA) OR (IR_resret#94367776 = null)) THEN null ELSE cast(IR_resret#94367776 as float) END AS IR_resret#94368119, CASE WHEN ((annual_retnet#94367777 = NA) OR (annual_retnet#94367777 = null)) THEN null ELSE cast(annual_retnet#94367777 as float) END AS annual_retnet#94368120, CASE WHEN ((std_retnet#94367778 = NA) OR (std_retnet#94367778 = null)) THEN null ELSE cast(std_retnet#94367778 as float) END AS std_retnet#94368121, CASE WHEN ((Sharpe_retnet#94367779 = NA) OR (Sharpe_retnet#94367779 = null)) THEN null ELSE cast(Sharpe_retnet#94367779 as float) END AS Sharpe_retnet#94368131, CASE WHEN ((PctPos_retnet#94367780 = NA) OR (PctPos_retnet#94367780 = null)) THEN null ELSE cast(PctPos_retnet#94367780 as float) END AS PctPos_retnet#94368135, CASE WHEN ((TR_retnet#94367781 = NA) OR (TR_retnet#94367781 = null)) THEN null ELSE cast(TR_retnet#94367781 as float) END AS TR_retnet#94368137, CASE WHEN ((IR_retnet#94367782 = NA) OR (IR_retnet#94367782 = null)) THEN null ELSE cast(IR_retnet#94367782 as float) END AS IR_retnet#94368139, ... 
2 more fields]
+- FileScan csv [sector_id#94367759,retIC#94367760,resretIC#94367761,numcos#94367762,numdates#94367763,annual_bmret#94367764,annual_ret#94367765,std_ret#94367766,Sharpe_ret#94367767,PctPos_ret#94367768,TR_ret#94367769,IR_ret#94367770,annual_resret#94367771,std_resret#94367772,Sharpe_resret#94367773,PctPos_resret#94367774,TR_resret#94367775,IR_resret#94367776,annual_retnet#94367777,std_retnet#94367778,Sharpe_retnet#94367779,PctPos_retnet#94367780,TR_retnet#94367781,IR_retnet#94367782,... 2 more fields] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/output/risk_factors/growth/st..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:s...
,None)

(5) Scan csv
Output [26]: [sector_id#94367759, retIC#94367760, resretIC#94367761, numcos#94367762, numdates#94367763, annual_bmret#94367764, annual_ret#94367765, std_ret#94367766, Sharpe_ret#94367767, PctPos_ret#94367768, TR_ret#94367769, IR_ret#94367770, annual_resret#94367771, std_resret#94367772, Sharpe_resret#94367773, PctPos_resret#94367774, TR_resret#94367775, IR_resret#94367776, annual_retnet#94367777, std_retnet#94367778, Sharpe_retnet#94367779, PctPos_retnet#94367780, TR_retnet#94367781, IR_retnet#94367782, turnover#94367783, coverage#94367784]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/output/risk_factors/growth/stats_sector_id.csv]
ReadSchema: struct<sector_id:string,retIC:string,resretIC:string,numcos:string,numdates:string,annual_bmret:string,annual_ret:string,std_ret:string,Sharpe_ret:string,PctPos_ret:string,TR_ret:string,IR_ret:string,annual_resret:string,std_resret:string,Sharpe_resret:string,PctPos_resret:string,TR_resret:string,IR_resret:string,annual_retnet:string,std_retnet:string,Sharpe_retnet:string,PctPos_retnet:string,TR_retnet:string,IR_retnet:string,turnover:string,coverage:string>

(6) Project [codegen id : 1]
Output [26]: [CASE WHEN ((sector_id#94367759 = NA) OR (sector_id#94367759 = null)) THEN null ELSE cast(sector_id#94367759 as int) END AS sector_id#94368013, CASE WHEN ((retIC#94367760 = NA) OR (retIC#94367760 = null)) THEN null ELSE cast(retIC#94367760 as float) END AS retIC#94368015, CASE WHEN ((resretIC#94367761 = NA) OR (resretIC#94367761 = null)) THEN null ELSE cast(resretIC#94367761 as float) END AS resretIC#94368017, CASE WHEN ((numcos#94367762 = NA) OR (numcos#94367762 = null)) THEN null ELSE cast(numcos#94367762 as float) END AS numcos#94368019, CASE WHEN ((numdates#94367763 = NA) OR (numdates#94367763 = null)) THEN null ELSE cast(numdates#94367763 as int) END AS numdates#94368101, CASE WHEN ((annual_bmret#94367764 = NA) OR (annual_bmret#94367764 = null)) THEN null ELSE cast(annual_bmret#94367764 as float) END AS annual_bmret#94368107, CASE WHEN ((annual_ret#94367765 = NA) OR (annual_ret#94367765 = null)) THEN null ELSE cast(annual_ret#94367765 as float) END AS annual_ret#94368108, CASE WHEN ((std_ret#94367766 = NA) OR (std_ret#94367766 = null)) THEN null ELSE cast(std_ret#94367766 as float) END AS std_ret#94368109, CASE WHEN ((Sharpe_ret#94367767 = NA) OR (Sharpe_ret#94367767 = null)) THEN null ELSE cast(Sharpe_ret#94367767 as float) END AS Sharpe_ret#94368110, CASE WHEN ((PctPos_ret#94367768 = NA) OR (PctPos_ret#94367768 = null)) THEN null ELSE cast(PctPos_ret#94367768 as float) END AS PctPos_ret#94368111, CASE WHEN ((TR_ret#94367769 = NA) OR (TR_ret#94367769 = null)) THEN 
null ELSE cast(TR_ret#94367769 as float) END AS TR_ret#94368112, CASE WHEN ((IR_ret#94367770 = NA) OR (IR_ret#94367770 = null)) THEN null ELSE cast(IR_ret#94367770 as float) END AS IR_ret#94368113, CASE WHEN ((annual_resret#94367771 = NA) OR (annual_resret#94367771 = null)) THEN null ELSE cast(annual_resret#94367771 as float) END AS annual_resret#94368114, CASE WHEN ((std_resret#94367772 = NA) OR (std_resret#94367772 = null)) THEN null ELSE cast(std_resret#94367772 as float) END AS std_resret#94368115, CASE WHEN ((Sharpe_resret#94367773 = NA) OR (Sharpe_resret#94367773 = null)) THEN null ELSE cast(Sharpe_resret#94367773 as float) END AS Sharpe_resret#94368116, CASE WHEN ((PctPos_resret#94367774 = NA) OR (PctPos_resret#94367774 = null)) THEN null ELSE cast(PctPos_resret#94367774 as float) END AS PctPos_resret#94368117, CASE WHEN ((TR_resret#94367775 = NA) OR (TR_resret#94367775 = null)) THEN null ELSE cast(TR_resret#94367775 as float) END AS TR_resret#94368118, CASE WHEN ((IR_resret#94367776 = NA) OR (IR_resret#94367776 = null)) THEN null ELSE cast(IR_resret#94367776 as float) END AS IR_resret#94368119, CASE WHEN ((annual_retnet#94367777 = NA) OR (annual_retnet#94367777 = null)) THEN null ELSE cast(annual_retnet#94367777 as float) END AS annual_retnet#94368120, CASE WHEN ((std_retnet#94367778 = NA) OR (std_retnet#94367778 = null)) THEN null ELSE cast(std_retnet#94367778 as float) END AS std_retnet#94368121, CASE WHEN ((Sharpe_retnet#94367779 = NA) OR (Sharpe_retnet#94367779 = null)) THEN null ELSE cast(Sharpe_retnet#94367779 as float) END AS Sharpe_retnet#94368131, CASE WHEN ((PctPos_retnet#94367780 = NA) OR (PctPos_retnet#94367780 = null)) THEN null ELSE cast(PctPos_retnet#94367780 as float) END AS PctPos_retnet#94368135, CASE WHEN ((TR_retnet#94367781 = NA) OR (TR_retnet#94367781 = null)) THEN null ELSE cast(TR_retnet#94367781 as float) END AS TR_retnet#94368137, CASE WHEN ((IR_retnet#94367782 = NA) OR (IR_retnet#94367782 = null)) THEN null ELSE 
cast(IR_retnet#94367782 as float) END AS IR_retnet#94368139, CASE WHEN ((turnover#94367783 = NA) OR (turnover#94367783 = null)) THEN null ELSE cast(turnover#94367783 as float) END AS turnover#94368141, CASE WHEN ((coverage#94367784 = NA) OR (coverage#94367784 = null)) THEN null ELSE cast(coverage#94367784 as float) END AS coverage#94368142]
Input [26]: [sector_id#94367759, retIC#94367760, resretIC#94367761, numcos#94367762, numdates#94367763, annual_bmret#94367764, annual_ret#94367765, std_ret#94367766, Sharpe_ret#94367767, PctPos_ret#94367768, TR_ret#94367769, IR_ret#94367770, annual_resret#94367771, std_resret#94367772, Sharpe_resret#94367773, PctPos_resret#94367774, TR_resret#94367775, IR_resret#94367776, annual_retnet#94367777, std_retnet#94367778, Sharpe_retnet#94367779, PctPos_retnet#94367780, TR_retnet#94367781, IR_retnet#94367782, turnover#94367783, coverage#94367784]

(7) ColumnarToRow [codegen id : 2]
Input [4]: [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013]

(8) Filter [codegen id : 2]
Input [4]: [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013]
Condition : isnotnull(sector_id#94368013)

(9) Project [codegen id : 2]
Output [5]: [sector_id#94368013, numcos#94368019, numdates#94368101, coverage#94368142, round((cast(numcos#94368019 as double) / cast(coverage#94368142 as double)), 0) AS universe#94368286]
Input [4]: [coverage#94368142, numcos#94368019, numdates#94368101, sector_id#94368013]

(10) InMemoryTableScan
Output [3]: [sector_id#94160418, sort#94160419, description#94160423]
Arguments: [sector_id#94160418, sort#94160419, description#94160423], [isnotnull(sector_id#94160418)]

(11) InMemoryRelation
Arguments: [sector_id#94160418, sort#94160419, description#94160423, universe#94160424], CachedRDDBuilder(org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer@208e3fd9,StorageLevel(disk, memory, deserialized, 1 replicas),*(1) Project [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
+- FileScan csv [sector_id#94160398,sort#94160399,description#94160400,universe#94160401] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex(1 paths)[file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>
,None)

(12) Scan csv
Output [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]
Batched: false
Location: InMemoryFileIndex [file:/srv/plusamp/data/default/ea-market/curate/curate_sector.csv]
ReadSchema: struct<sector_id:string,sort:string,description:string,universe:string>

(13) Project [codegen id : 1]
Output [4]: [CASE WHEN ((sector_id#94160398 = NA) OR (sector_id#94160398 = null)) THEN null ELSE cast(sector_id#94160398 as int) END AS sector_id#94160418, CASE WHEN (sort#94160399 = null) THEN null ELSE sort#94160399 END AS sort#94160419, CASE WHEN (description#94160400 = null) THEN null ELSE description#94160400 END AS description#94160423, CASE WHEN ((universe#94160401 = NA) OR (universe#94160401 = null)) THEN null ELSE cast(universe#94160401 as int) END AS universe#94160424]
Input [4]: [sector_id#94160398, sort#94160399, description#94160400, universe#94160401]

(14) Filter [codegen id : 1]
Input [3]: [sector_id#94160418, sort#94160419, description#94160423]
Condition : isnotnull(sector_id#94160418)

(15) BroadcastExchange
Input [3]: [sector_id#94160418, sort#94160419, description#94160423]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7534485]

(16) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sector_id#94368013]
Right keys [1]: [sector_id#94160418]
Join condition: None

(17) Project [codegen id : 2]
Output [7]: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
Input [8]: [sector_id#94368013, numcos#94368019, numdates#94368101, coverage#94368142, universe#94368286, sector_id#94160418, sort#94160419, description#94160423]

(18) Exchange
Input [7]: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
Arguments: rangepartitioning(sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST, 200), ENSURE_REQUIREMENTS, [id=#7534490]

(19) Sort [codegen id : 3]
Input [7]: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
Arguments: [sort#94160419 ASC NULLS FIRST, description#94160423 ASC NULLS FIRST], true, 0

(20) CollectLimit
Input [7]: [sector_id#94368013, numcos#94368019, numdates#94368101, sort#94160419, description#94160423, universe#94368286, coverage#94368142]
Arguments: 1000000
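
The plan above can be read as a small pipeline: clean the string columns of both CSV inputs, derive `universe` as `round(numcos / coverage, 0)`, inner-join on `sector_id` using the sector table as the hash build side, sort by `sort` and `description` with nulls first, then cap the result at 1000000 rows. The following is a minimal plain-Python sketch of those semantics only; it does not use Spark, and the names `clean`, `universe`, `run` and the row dictionaries are hypothetical. It assumes well-formed numeric strings, ignores the intermediate 32-bit `float` casts, and relies on the SQL rule that `x = null` is never true, so only the `'NA'` branch of the `CASE WHEN` can fire.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP


def clean(value, cast):
    """Mirror CASE WHEN (v = NA) OR (v = null) THEN null ELSE cast(v) END.

    A null input also ends up null, because cast(null) is null.
    """
    if value is None or value == "NA":
        return None
    return cast(value)


def universe(numcos, coverage):
    """round(numcos / coverage, 0), with null for null or zero operands.

    Assumption: half-up rounding and null on division by zero, as in
    Spark's default (non-ANSI) behavior.
    """
    if numcos is None or coverage is None or coverage == 0:
        return None
    ratio = Decimal(numcos / coverage)
    return float(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def run(stats_rows, sector_rows, limit=1000000):
    # Build side (right): hash the small sector table by non-null sector_id.
    build = defaultdict(list)
    for s in sector_rows:
        sid = clean(s["sector_id"], int)
        if sid is not None:
            build[sid].append(s)

    # Probe side (left): filter null keys, derive universe, emit matches.
    out = []
    for r in stats_rows:
        sid = clean(r["sector_id"], int)
        if sid is None:
            continue
        numcos = clean(r["numcos"], float)
        coverage = clean(r["coverage"], float)
        for s in build.get(sid, []):
            out.append({
                "sector_id": sid,
                "numcos": numcos,
                "numdates": clean(r["numdates"], int),
                "sort": s["sort"],
                "description": s["description"],
                "universe": universe(numcos, coverage),
                "coverage": coverage,
            })

    # ASC NULLS FIRST on (sort, description): None sorts before any string.
    out.sort(key=lambda row: tuple(
        (row[k] is not None, row[k] or "") for k in ("sort", "description")
    ))
    return out[:limit]
```

Calling `run` with a few hand-written rows shaped like the two CSV schemas returns dictionaries whose keys follow the seven-column output order of operator (17).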