Skip to Content
Technical Articles

Populate Dates Between Start & End Dates

Requirement

This blog is intended to show how we can fill the dates if dates are saved in two columns as StartDate or EndDate ( Refer Sample Data of Date Rang Table Screenshot).

We have one Transaction table which is having data for each day (Example: PGI_DATE).

In another table, data is entered by date range.

Sample Data of Date Range Table

Sample Data From Transaction Table

 

Now requirement is to populate PERIODID in the output if our transaction date falls under the given date range of any PERIODID.

Example –

  1. If PGI_DATE is 01-Dec-2019 then Pick Period ID 53119
  2. If PGI_DATE is 02-Dec-2019  or 03-Dec-2019 then Pick Period ID 53120

Solution 1 – Using Cursor in Table Function

We can create table function and use cursor in the table function. Here is the code.

FUNCTION "AP_CUSTOM"."KCC.AP_CUSTOM.IBP::TF_IBP_DATE_PERIOD_RANGE" ( )
       RETURNS TABLE (PERIODSTART DATE, PERIODEND DATE, PERIODID VARCHAR (5), DAY_DATE DATE)
       LANGUAGE SQLSCRIPT
       SQL SECURITY INVOKER AS
BEGIN

/*****************************
       Write your function logic
******************************/

DECLARE v_StartDT DATE;
DECLARE v_EndDT DATE;
DECLARE v_PeriodID VARCHAR (5);
DECLARE v_FinalResult TABLE (PERIODSTART DATE, PERIODEND DATE, PERIODID VARCHAR (5), DAY_DATE DATE);

-- Declare Cursor to Fetch Records from Date Range Table
DECLARE CURSOR c_Date FOR SELECT PERIODSTART,PERIODEND,PERIODID FROM AP_CUSTOM.AP_IBP_CT_PERIOD_ID;

-- Read Cursor Row One by One. 
FOR cur_row as c_Date DO

-- Select PERIODSTART & PERIODEND From Date Range Table which coming in Each Cursor Row
SELECT PERIODSTART, PERIODEND, PERIODID INTO v_StartDT, v_EndDT, v_PeriodID FROM AP_CUSTOM.AP_IBP_CT_PERIOD_ID WHERE PERIODID = cur_row.PERIODID;

-- Select Date from Time Dimension Table by Adding Filter on DATE_SQL Column
v_TblOut = SELECT v_StartDT AS "PERIODSTART", v_EndDT AS "PERIODEND", v_PeriodID AS "PERIODID", DATE_SQL AS "DAY_DATE" FROM _SYS_BI.M_TIME_DIMENSION WHERE DATE_SQL >= v_StartDT AND DATE_SQL <= v_EndDT;

-- Insert Output Into Table Variable 
INSERT INTO :v_FinalResult SELECT PERIODSTART, PERIODEND, PERIODID, DAY_DATE FROM :v_TblOut;

END FOR;

-- Return Final Result From Table Variable (:v_FinalResult) as Output
RETURN SELECT PERIODSTART, PERIODEND, PERIODID, DAY_DATE FROM :v_FinalResult;

END;

 

Solution 2 – Using Select Statement in Table Function

Another option is to use Select Statement. But this approach has one drawback that it will Create Cross Join between Time Dimension and Date Range table.

Also, Performance will not be optimal in this as compared to approach 1.

But cursor will execute number of times same as the records available in Date Range Table.

Example:

Period in Date Range Table 10.

Data Pulled from TIME_DIMENSION Table for 1 Year (365 Day)

Total Number of Rows will be processed by Query will be – 10 * 365 = 3650

Here is the Sql Statement

-- Use this Code In Table Function

SELECT AA."PERIODSTART",AA."PERIODEND",AA."PERIODID",TT.DATE_SQL AS "DAY_DATE"
FROM XX.DATE_RANGE_TABLE AA
INNER JOIN  _SYS_BI.M_TIME_DIMENSION TT
ON (TT.DATE_SQL >= AA."PERIODSTART" AND TT.DATE_SQL <= AA."PERIODEND")
ORDER BY TT.DATE_SQL;

Join above result with Transaction table (PGI_DATE Left Outer Join On DATE_SQL) and fetch PERIODID in the output.

Please let me know what you think on this or anyone has better approach.

Thanks!

2 Comments
You must be Logged on to comment or reply to a post.
  • Hi Kapil,

    thank you for your blog. I think, neither CURSOR nor Non-Equi Join are optimal solutions I created a solution with the window function MAX() under the assumptions that:

    • the PERIODIDs are in the same order as their PERIODSTART date. Otherwise you need an additional column, that has this feature.
    • the Sample Data contains all possible PERIODSTART dates of the PERIDOS. Otherwise you have to create a complete mapping table in a previous step with all possible dates.

     

    /*****************************************************************
     * Period data
     *****************************************************************/     
     --Create a period table, PERIODEND will be skipped, because it 
    -- is redundant and can cause inconsistencies.
    create table periods(periodid nvarchar(30), 
                         periodstart date);                     
    
    -- Insert some periods in the table with 
    insert into periods  (
      select 'PER_' || lpad(element_number, 5, '0') as periodid, 
             generated_period_start                     
      from series_generate_date( 'INTERVAL 1 MONTH', 
                                 date'2000-01-01', 
                                 date'2022-01-01'));
     
    --Check the values in periods... 
    select * from periods;
    
    /*****************************************************************
     * Sample data
     *****************************************************************/                   
    create table sd(material nvarchar(18), 
                    plant    nvarchar(4), 
                    PGIDATE  nvarchar(8));
    
    insert into sd(
    select '000234234', 
           'ABCD',
           to_dats(generated_period_start) 
      from  series_generate_date( 'INTERVAL 1 DAY' , 
                                  date'2020-01-01', 
                                  date'2021-01-01')) ;
      
    --Check the sample data
    select * from sd;
    
    /*****************************************************************
     * Populate the periods
     *****************************************************************/     
    select material, 
           plant, 
           pgidate, 
           max(periodid) over (order by pgidate) as periodid_populated
      from sd as s
      left outer join periods as p
      on s.pgidate = to_dats(p.periodstart)
      order by pgidate;
      
    drop table periods;
    drop table sd;

    The window functions force the system to the row engine. But in this case, I am sure it is faster than your approach 1, because it is still declarative SQLScript and can be well optimized and paralleled. And I guess it will also be faster than your approach 2, because in my experiences, the non-equi join performs very poor with a reasonable amount of data. 

    Does anyone else have an alternative solution?

    Regards,
    Jörg