Customer Retention

Hello,
I’m trying to make a retention report. Here is a sample file. My data model has about 22 data tables and I cleaned up as much as I could to explain the problem I’m having. Also I left the relationships as they are in my original file.
Retention.pbix (2.9 MB)

image
New Customers column works fine in my actual file. I just can’t figure out the next columns. Below, I have tried to simplify the columns.

The table I’m trying to create is like this. We have 13 cycles in a year. I’m trying to get them on the ‘Cycle’ column.

New Customers Enrolled in 2020-C10’ column should have new customers who enrolled in 2020-C10.
Customers that enrolled in 2020-C10 placed an order in Cycle 11’ column should have, out of those 228 enrolled in 2020-C10 that placed an order in 2020-C11.
Customers that enrolled in 2020-C10 placed an order in Cycle 12’ column should have, out of those 228 enrolled in 2020-C10 that placed an order in 2020-C12.
Customers that enrolled in 2020-C10 placed an order in Cycle 13’ column should have, out of those 228 enrolled in 2020-C10 that placed an order in 2020-C13.

So on and so forth. Of course I need to have C1-C13 horizontally.

I have been working on this for many days and really appreciate any help. :pray:t5:

Thank you so much!

Hi @nipunadv,

I have concerns about your model, in my opinion you don’t need these bi-directional relationships and have changed them to single direction.

I would also suggest to bring CALENDAR_CYCLES and CALENDAR-DAYS together into a single Date dimension table. So you’ll end up with a start schema.

Next I created this measure (returns the same result as your version)

New Cust = 
VAR myCust = 
    CALCULATETABLE( 
        VALUES( CUSTOMERS[CustomerID] ), 
        USERELATIONSHIP( CALENDAR_CYCLES[PeriodID], CUSTOMERS[EntryPeriodID] )
    )
RETURN

COUNTROWS( myCust )

and for the comparison measures, repeated this pattern.

2020-C10 Cycle +1 = 
VAR FirstCycle = MINX( ALLSELECTED( CALENDAR_CYCLES[CycleText] ), [CycleText] )
VAR _Year = MAX( CALENDAR_CYCLES[Year] )
VAR _Cycle = MAX( CALENDAR_CYCLES[Cycle] )
VAR FirstCust =
    CALCULATETABLE( 
        VALUES( CUSTOMERS[CustomerID] ), 
        USERELATIONSHIP( CALENDAR_CYCLES[PeriodID], CUSTOMERS[EntryPeriodID] ),
        CALENDAR_CYCLES[CycleText] = FirstCycle
    )
VAR CustOrders = 
    CALCULATETABLE( 
        VALUES( SALES[CustomerID] ), 
        CALENDAR_CYCLES[Year] = _Year,
        CALENDAR_CYCLES[Cycle] = _Cycle +1,
        ALL( CALENDAR_CYCLES[CycleText] )
    )
VAR _Result = INTERSECT( FirstCust, CustOrders )
RETURN

COUNTROWS( _Result )

With this result
image

That doesn’t match your expected result but is accurate when checked in excel.
image

Here’s your sample file. eDNA Retention.pbix (2.9 MB)
I hope this is helpful.

5 Likes

Hi @Melissa,

Thank you so much for your response. I will make a note of these bi-directional relationships. I changed them to bi-directional at the beginning because some numbers were not coming out right. Right now it does not let me change the directions and give me “The cardinality you selected isn’t valid for this relationship”. My model pretty big. image
Also I will try to combine the Date table into a single Date table.

I created 13 measures (for 13 Cycles) as you advised and I am not getting any numbers :pensive:

Bi-directional relationships are the only things that are different from the sample file you sent and the original file. Really appreciate any suggestions.

Thanks again!

@nipunadv, that looks absolutely frightening…

So here’s something to think about before you move on because I can’t make out anything or zoom into the depicted model but it looks like a “one model to answer all your questions” which isn’t a good design principle.

Each model should be designed to answer a small set of specific (related) questions.

Only bring what you absolutely need and strip away all redundant columns from your tables.

DAX is optimized for star schema’s maybe you can de-normalize other tables as well to create a single DIM table like combining the CALENDAR_CYCLES and CALENDAR-DAYS in order to get a single Date dimension table which has a unique key at the lowest granularity, the date level.
Note that for DAX time intelligence functions to work there are a number of other requirements for this specific dimension table, see this post here.

In your sample file you had BLANKS in the CUSTOMER Dim table, a Key should be unique and shouldn’t contain blanks - that will prevent a 1:M relationship

Modelling, unless you have a really deep understanding of the implications of using M:M relationships (see post here) and you can’t avoid it with bridge tables for example - then be my quest - in all other cases go the extra mile to create a 1:M relationship.

To understand the flow of filters throughout a model, place them in a waterfall like fashion. All DIM tables on top and all FACT tables below. Hide all Key fields (in DIMs) not meant for filtering or visualization AND hide all Keys in FACTs, no exceptions. (see post here)

I hope this is helpful.

2 Likes

@Melissa Thank you so much for your feedback and advice.
Yes, we use one model to answer all the questions and we keep adding new tabs as we need to answer those questions. It’s good to know that we are not supposed to do that so we can start thinking about remaking the model.
Also we are trying to combine CALENDAR_CYCLES and CALENDAR-DAYS and have one single Date Table. At this point, the model does not let me change any M:M relationships I have but I will certainly look for ways to improve or re-create the model completely to a start schema. I’m not even sure if it’s possible to re-make it as a star schema with this many tables as we are trying to answer all our questions with one model.

So for the time being, what I’m trying to do is make a smaller model with tables I need to create this retention table. I made another model with same exact data tables as the example you sent me (Date table, 1:M relationships, no bi-directional relationships, etc. ) but I don’t get values.
image

I’m just wondering if I could use CycleIDimage instead of Cycle and CycleText. Since I’m trying to show, out of those new customers enrolled in the Cycles listed vertically image how many customers placed orders on each cycle horizontally like below, I feel like using only CycleID would be easier, then it would be CycleID+1, CycleID+2, so on…

So I created another measure using only CycleID and it does not give me any values.
Here’s the measure and the file.eDNA Retention-v2.pbix (2.9 MB)

Cycle +1 (CycleID) = 
VAR FirstCycle = MINX( ALLSELECTED( CALENDAR_CYCLES[CycleID] ), [CycleID] )
VAR _Cycle = MAX( CALENDAR_CYCLES[CycleID] )
VAR FirstCust =
    CALCULATETABLE( 
        VALUES( CUSTOMERS[CustomerID] ), 
        USERELATIONSHIP( CALENDAR_CYCLES[PeriodID], CUSTOMERS[EntryPeriodID] ),
        CALENDAR_CYCLES[CycleID] = FirstCycle
    )
VAR CustOrders = 
    CALCULATETABLE( 
        VALUES( SALES[CustomerID] ), 
        CALENDAR_CYCLES[CycleID] = _Cycle +1,
        ALL( CALENDAR_CYCLES[CycleID] )
    )
VAR _Result = INTERSECT( FirstCust, CustOrders )
RETURN

COUNTROWS( _Result )

Again, thank you so much for helping me to figure this. Really appreciate your help!

Hi @nipunadv,

So I’ve attempted to turn you data into a start schema but several issues with this sample dataset became apparent… I have dealt with that BUT you’ll have to check and devise a way to deal with that in your production model that is in accordance with your corporate logic.

Here are some of the issues in my view:

  1. Cycle table includes the same Enddate as Startdate for a new period, so that date is present in two periods. My solution. Kept the Startdate and excluded the Enddate, this generates 7 days for the record.
    image
  2. Based on these ranges I’ve created a record for each date but there’re not enough records to cover an entire year, which is a date table requirement in order for Time Intelligence functions to work (see link prev reply)
  3. Your DIM Customer table contained blanks. My solution. I removed all blank records
    image
  4. Your Customer table contains an invalid date. My solution. Removed errors.
    image
  5. Your Customer table includes an EntryPeriodID and EnrollmentDate but the EntryPeriodID doesn’t match the PeriodID based on the EnrollmentDate. As shown here this date is assigned to multiple PeriodID’s that doesn’t seem correct.
    image
  6. The same issue is present in the FACT Sales table, where the OrderDate doesn’t match the assigned PeriodID, if checked against the date table.

For both issue 5 and 6 ideally you would only keep the Date field and lookup the PeriodID in the Date table. Unless of course I interpreted this wrong and there is no relation between these 2 fields…

This is what the new model looks like
image

And these are the new measures, for new customers used TREATAS because there is no relationship between the DIM Date and DIM Customer.

New Cust = 
VAR myCust = 
    CALCULATETABLE( 
        VALUES( CUSTOMERS[CustomerID] ), 
        TREATAS( VALUES( Dates[Date] ), CUSTOMERS[EnrollmentDate] )
    )
RETURN

COUNTROWS( myCust )

and

Cycle +1 (CycleID) = 
VAR FirstCycle = MINX( ALLSELECTED( Dates[Cycle] ), [Cycle] )
VAR _Cycle = MAX( Dates[Cycle] )
VAR FirstCust =
    CALCULATETABLE( 
        VALUES( CUSTOMERS[CustomerID] ), 
        TREATAS( VALUES( Dates[Date] ), CUSTOMERS[EnrollmentDate] ),
        Dates[Cycle] = FirstCycle
    )
VAR CustOrders = 
    CALCULATETABLE( 
        VALUES( SALES[CustomerID] ), 
        Dates[Cycle] = _Cycle +1,
        ALL( Dates[CycleText] )
    )
VAR _Result = INTERSECT( FirstCust, CustOrders )
RETURN

COUNTROWS( _Result )

However it seems you still have a lot of work ahead of you before you can get back to the DAX analysis…

Here’s my reworked file, I hope you find some elements useful - just add the CSV folder path to the list in the FileFolderLocation parameter and all Queries will be restored.

CUSTOMERS.csv (53.5 KB)
ORDERS.csv (19.8 KB)
SALES.csv (37.5 KB)

eDNA - Retention cleaned.pbix (2.9 MB)

We don’t offer consulting on the forum, please mark this thread as solved and if you need further in depth help sorting your model and developing reports please contact a BI professional for assistance.

1 Like

Hi @Melissa,

Thank you so much for the detailed explanation. Really appreciate your help!

1 Like