Rut Galar (MonkeyTaps): Growing to 600K MAUs using gamification

Levels Podcast · 2026-06-22 · 51 min

Substance score

49 / 100

Five dimensions, 20 points each

Insight Density: 10 / 20

Originality: 9 / 20

Guest Caliber: 9 / 20

Specificity & Evidence: 11 / 20

Conversational Craft: 10 / 20

Rut Galar from MonkeyTaps discusses how the Vocabulary app grew from 150K to 600K monthly active users over three years by focusing on gamification features and retention optimization, particularly through a practice game feature, streaks with intelligently-timed reminders, and careful experimentation around user interruptions.

Key takeaways

The initial, simple 10-question practice game (guess the word) became their top premium-converting feature immediately after launch, driving significant DAU growth during the back-to-school and New Year's peaks.
Streak reminders work best when sent at users' established daily usage patterns rather than fixed times, requiring analysis of when each user naturally engages with the app.
Achievements and badges initially showed adoption only with power users and weren't worth maintaining given the low impact on core user base, leading the team to deprioritize them for now.
Subtle toast notifications for streak milestones proved better than prominent interruptions, as users often came to the app via notifications or widgets for specific reasons unrelated to streak tracking.
The team reduced feature complexity by automatically awarding streak freezes (3 freezes was their sweet spot) without user choice, and gave them invisibly at milestones to avoid notification fatigue.

Guests

Rut Galar

Topics in this episode

MonkeyTaps, Vocabulary app, Gamification, Streaks, Reminder optimization, Back-to-school seasonality, New Year's peaks, Practice games, Monetization testing, AI image processing apps

What our scoring noted

Our reviewer’s read on each dimension, with quotes from the episode.

Insight Density

10 / 20

There are genuine tactical learnings scattered throughout - most notably that streaks function primarily as a reminders mechanism rather than a motivational device in themselves, and the clever expansion of the game library using time/lives pressure rather than building entirely new games. However, these insights are diluted by slow pacing, host anecdotes about their own product Trophy, and repetitive conversational filler that drops the ideas-per-minute ratio significantly.

our biggest learning was it's not about streak, it's about the fact that you have reminders to users outside of your application. And streak freezes were another way for us to also add, uh, reminders.

we're going to add more games, but not reinventing new games, but adding game features within the existing games. So that's when the concept of lives and time came into play and we launched the challenge games

Originality

9 / 20

The reframe of streaks as a reminder-delivery mechanism rather than an intrinsic engagement driver is the episode's one genuinely non-obvious insight, and the invisible freeze replenishment as a 'nice surprise' is a subtle but interesting UX principle. Beyond these, the episode largely rehearses conventional gamification wisdom - achievements, Duolingo comparisons, back-to-school seasonality - without meaningfully challenging or extending it.

our biggest learning was it's not about streak, it's about the fact that you have reminders to users outside of your application

We also experimented with the streak freezes reminders too. I think that's why we found out streak freezes were working

Guest Caliber

9 / 20

Rut Galar is a genuine practitioner who has run real experiments on a real product at modest but meaningful scale (600K MAUs), giving her concrete operational credibility. However, she is a senior PM at a small indie app studio, not an executive at significant scale, and the company's B2C consumer app context limits direct transferability for B2B operators.

last summer, summer 2024, we decided to put like a focus on vocabulary because we saw a lot of growth potential there. That's when I, for example, started focusing more on that app

we built things in one app and then we extrapolate it to the other and we experiment a lot

Specificity & Evidence

11 / 20

The episode includes some useful concrete figures - 150K to 600K MAU growth, ~20% DAU lift from the back-to-school push, ~30% uplift from New Year's iteration, 3 freezes as the sweet spot, 2 - 3 week decision cadence - but almost no hard conversion rates, retention curves, or A/B test effect sizes from individual experiments, leaving the evidence base thinner than the anecdotes suggest.

around like 20% of our increase in DAUs came from that, that moment. Then of course, like January 2025...that's when we get around like another 30% uplift with all those like little experiments

Our sweet spot was 3.

Conversational Craft

10 / 20

The hosts ask reasonable process-level follow-ups - pushing on how streak interruptions were resolved, how freezes replenish, how rollbacks are handled - and the Trophy product-building context gives them genuine curiosity. However, they let several vague claims pass unchallenged, spend meaningful time narrating their own experience rather than extracting the guest's, and rarely push for hard numbers or mechanisms behind the results cited.

So how did you solve that problem? Like, if someone taps the notification or the widget or something, and it takes them straight into a particular screen in the app, how do you show them that the streak has been incremented without getting in the way of what they're trying to do?

Do the users earn new freezes over time on some particular interval?

Conversation analysis

Computed from the transcript - who did the talking, and the verbal tics along the way.

Share of words spoken

Speaker B: 71%
Speaker A: 24%
Speaker C: 6%

Filler words

like 180, so 142, uh 93, um 75, right 59, kind of 32, you know 20, I mean 12, basically 11, er 7, sort of 6, actually 6

Episode notes

Rut Galar is a Senior Product Manager at MonkeyTaps, a company focused on building simple mobile apps for positive thinking and personal growth. In this episode, Rut discusses the growth and development of Vocabulary, an app to help people improve their vocabulary every day, which has grown from 150K to 600K MAUs in 3 years. The discussion covers the strategies used to improve user engagement, including gamification, user retention techniques, and feedback mechanisms. Rut shares insights on experimentation and A/B testing, as well as future plans for incorporating leaderboards and social features into the app. The conversation highlights the importance of understanding user behavior and the challenges of gathering meaningful feedback. Links: Rut Galar, MonkeyTaps. Powered by Trophy - the gamification layer for consumer apps.

Full transcript

51 min

Transcribed and scored by The B2B Podcast Index.

Speaker A: Do you ever get some really weird people and feedback coming through those channels? Yes, I, uh, imagine so.

Speaker B: Yes, indeed. And the videos are sometimes funny. We've got people that just sing on the videos and doesn't really answer the questions. They just sing because they think that, you know, if they just record the video they would get the subscription. So, yeah, we've had interesting. Or people just smoking or in their. Yeah, yeah, things like that.

Speaker C: Alrighty.

Speaker A: Welcome, Rut. So today we've got Rut from MonkeyTaps. This is a company that's released a, uh, variety of different apps. Each app is sort of very simple and contained and focused on one particular thing. And Rut is a senior product manager. She works on primarily the Vocabulary app, which is, as you can imagine, an app to expand your vocabulary in a particular language. And we'll talk about gamification and the product decisions that Rut made along the way that increased retention and grew the app from one, uh, hundred fifty K monthly active users three years ago to six hundred K today. So quite a big jump. Welcome, Rut.

Speaker B: Thank you. Thanks. Nice for me. I'm happy to be here and uh, share with you some of our journey in Vocabulary specifically, but also in MonkeyTaps as a whole. So maybe give a bit of an intro on what is it that we do. We're a, um, company that's mainly focused on building mobile apps. We have a private variety of around five apps that are right now focused on personal growth. Um, they're positive apps that people can use to grow in different areas. As you mentioned, there are apps that are very good at doing like one thing. We have Motivation, that uh, gives you quotes daily to boost yourself, I Am, or Vocabulary, which is the one that uh, we probably would focus on more today, that um, allows you to increase the lexicon on a specific language. Um, this summer we're also opening like a new line of app portfolio, uh, more into AI image processing with apps that you can already find in our uh, app store, like Isidro Toys that gives you the option to edit images around, um, toys from your kids and um, give them some life. So it's very interesting. We have other apps like Exterior for like Exterior or Garden Design, Hair studio or house AIs. They're uh, all around AI image processing so that we're diversifying that way. But uh, right now our main focus is those other simple apps for personal growth, as I mentioned.

Speaker A: Cool. So as far as I think where we should begin is, as I mentioned before, it sounds like you guys went from on the vocabulary app 150k MAUs 3 years ago to 600k today. Was that growth, what was the breakdown of that growth? Was it a big chunk of it from just getting better at like running ads and getting new people coming in and downloading the app? Or was a big chunk of it more retaining users and actually just getting much better at getting them to keep the app installed and kind of growing it that way?

Speaker B: So it's been a bit of a combo, but we did put a lot from a product perspective, we put a focus on that app because, uh, the way we operate our apps are very similar. So we build things in one app and then we extrapolate it to the other and we experiment a lot. But last summer, summer 2024, we decided to put like a focus on vocabulary because we saw a lot of growth potential there. That's when I, for example, started focusing more on that app. Whereas before we were just like building features across apps, but we had like a dedicated team focus on, okay, what is it that our vocabulary users, uh, need, uh, to be retained in the app more concretely? And that's when we launched the games feature. We had some users requests around. Okay, you're providing me a lot of content, right? You're providing me cool words that I discovered daily through widgets, through reminders, through coming to the app and just, you know, scrolling through the feed. But we want a way to retain them. And that's when the whole like, gamification level came into play and we built our first game to allow users to just play with the words that they were saving or the words that they were discovering. That was a big success. It immediately, um, converted to be our top premium converting feature post-onboarding. And then from there on we kept investing. Of course, for us, it wasn't a random decision to put that focus on summer because we have two peak moments, which is September. So back to school mindsets for growth apps. It's very important, right? That's when people have, um, all those, you know, big intentions to change something, like I want to learn vocabulary or I want to practice affirmations, etc. So that's why we put a lot of effort on that on the summer. So that then for September, we were ready to, uh, you know, to convert that influx of people that we normally get in September. 
Then we did a lot of like, tweaks to it also combined with like, of course, marketing campaigns back to school, etc. But then, and that was around like 20% of our increase in DAUs came from that, that moment. Then of course, like January 2025. So New Year's is also a big peak for us and we did a lot of tweaks during like end of the year, um, last year to make sure that again we were ready. So with all the learnings that we had, we experimented a lot with other concepts on gamification as well that didn't work so well. Like I don't know, I remember things like um, we launched achievements, right? We know that um, um, Duolingo and many other apps are doing these kind of things. It didn't work out for us or at least the concept that we put live. So we experimented a lot around last year, uh, end of last year to be ready for this New Year's peak. And that's when we get around like another 30% uplift with all those like little experiments. And then right now this summer we've put a lot of extra effort on vocabulary as well because we, we've seen it working. We added new game dynamics, um, beginning of the years till now we've done a lot of enhancements with the streaks to find our sweet spot on how much we have to focus on showing our users streaks and how much we want to disturb their experience with it. So we've done many tests around this, this um, and of course on parallel, not only like retention focus, but there's been even like countless tests on our onboarding. So we have a super optimized onboarding. We've done a lot of pricing tests as well. So we got to a point where in this summer we have an app that has a very optimized onboarding, has the pricing that we have tested and we think it's in our sweet spot and we have like retaining features like all those uh, games. We also launched AI Voices that was also um, um, a uh, converting feature. 
And then we got this design award, ah, finalist from Apple, which gave us also some visibility, uh, onto the, the back to school campaign that we are just like ending now that has given us a super peak, the biggest that we've seen in the app so far.

Speaker A: Cool. So let's go into the gamification stuff. So before summer 2024, what did the vocabulary app look like? Was it like you installed the app, you went through some onboarding and then basically you could, whenever you wanted to use the app, you would scroll through words and um, learn those words just through scrolling. Is that pretty much it?

Speaker B: Yeah, there was like some features like you could save and have like collections created or favorite that you could then revisit. But Pretty much that. That was it. Of course we had reminders and widgets which are a, ah, big part of all our apps and of course of the vocabulary app too. But there wasn't much to do in terms of like, actions from the users within the app other than just like consuming the content, organizing it and setting up things to remind you of that content outside of the app.

Speaker A: Okay, so let's go through the things you added then. What was the first addition that you added?

Speaker B: So we launched our practice feature. We started very simple by adding just like a placeholder in our main, like in the feed where users could just access a, ah, ten, uh, words game. So we tested the users in what we call the guess the word. So we give you the definition and then you have to choose among three words, 10 questions give you results. That was it. Um, we started by just providing it for only our premium users because we wanted to understand what was the monetization factor of features like that. And then we also did some tests around what if we open it for like one time for free users or we just give them like a snapshot of the first, um, you know, questions, et cetera, et cetera. At the end with all those iterations, we left it to a point where it was only premium feature. If you're a free user, you can use it during the trial period. Basically. That was our first addition.

Speaker A: Cool. And did you like notify people to do that, like on a certain cadence, like once a day or anything like that? Or was it just there in the home screen for people to do when they wanted to?

Speaker B: So it was on the home screen, of course we added. We also experimented around how to bring people there with things like bottom sheets around. Hey, we have these new features. And to bring adoption, there was also a tool tip for new users. When you, uh, finish your onboarding, we tell you, um, combined with the save, um, we want you to start saving words that we can then use during your practice. So we combine this tooltip with, okay, first say five words and then come back to play them. And after the first practice, um, we were asking users whether they wanted to add a reminder to practice on a daily basis. Yeah.

Speaker A: So everything sort of opt in. So it sounds like you were thinking from the beginning about this gamification experiment mostly as a way to increase conversions to premium, not like to retain the premium user so much as converting them in the first place. Is that right?

Speaker B: Yeah, yeah, ah, exactly. I mean it's not one or other, of course. This is very heavily. And right now we're seeing it. Right. It's very heavily. I mean, it has to be used by premium users. So. And we have seen that part of our retention uplift has come from that. But we tend to. Right now we're starting to think, whereas before we were thinking a lot about DAUs for the sake of DAUs. Right. Like we want users to be retaining our app. We want to make sure that those users that are in our app are using our most valuable features, AKA premium features.

Speaker C: Yeah, I think the next one was going to be the one that had the achievements and maybe the streaks come in. So, yeah, it'll be good to talk about where that all came from and the types of things that you were testing around that.

Speaker A: And then before we do that, just to clarify, this initial game feature was the one that immediately, um, led to a bump, right, in conversion, Like a significant bump?

Speaker B: Yes. Okay. That like super simple 10 questions game.

Speaker C: Yeah.

Speaker A: Cool.

Speaker C: Yeah.

Speaker A: Okay, so let's move on to the next one. Was the next thing, uh, achievements or badges or was. Did the streaks come first or what was next?

Speaker B: Yeah, we did have streaks in other app, in I Am, so we like. While we were testing the game part on vocabulary, we were also testing streaks in. In the I Am app. And then before getting streaks into vocabulary, we did also experiment things around achievements. Right. Or stats. A lot of times users were asking us about how much have I progressed? Things like that. And that's where the idea of achievements came into play. And we did something very simple where we gave users badges based on certain interactions. It did show some adoption for some very niche group like our power users, basically. But we didn't see much engagement from new users or our core user base and we decided not to go forward with it. I think this was a bit of a. I, uh, mean, it was more of a strategic decision because we tend to keep our apps very simple. One of our core values is minimalism. And we. Even if we see small potential on a feature, features like that, that need some kind of maintenance as well from our side. Right. How often are we going to get new badges in? How, you know, uh, all that content part, we didn't see major impact that would justify, basically, yeah. Maintaining such features. So we decided to not go forward with it. We might experiment around this this year because we have some ideas around stats, leaderboard achievements. So that's something that we would hopefully be looking forward at the end of this year. But that's something that we tested already last year and it didn't work out for us back then.

Speaker A: Does the vocabulary app have any social component? Like, is there any way to do anything with your friends in there or, like, learn words with friends?

Speaker B: Not at the moment.

Speaker A: Okay. Because in, uh, our experience with the achievement stuff, I think part of achievements and badges and stuff is being able to kind of show them off, and maybe that's a part of life. Like, for some apps, it works. And then for you guys, at least, where the app was last year, it didn't work yet.

Speaker B: I mean, we do have the share, uh, component. So you can either share words or. When we did implement these achievements, there were, uh. Of course you could share your badge on social media and stuff, but the adoption of this sharing was very, very low. Yeah, yeah.

Speaker A: It's hard to get people now to share things on social media in general, I think, because, like, I think it needs to almost be, like, a trend for people to want to, like, share this stuff. Yeah. Charlie, any questions on that?

Speaker C: Yeah. What were the types of badges that you were trying? Was it getting people to kind of do the same thing multiple times and then having kind of milestones based on that? Or was it, like, using achievements to try and explore different features? What were the kind of things that you were trying?

Speaker B: So we looked into our core features, like I said, right? We thought, what are our users doing within the app? Right? They're reading, they're saving, they're favoriting, they're playing. So we looked at, like, the actions that they were doing, and we tried to, um, establish, okay, what are the actions that are monetization actions. So, for example, if you're changing categories, if you're a free user, you don't have access to all the categories, right? So if you want to change categories multiple times and, like, explore the categories, you have to be premium user. If you want to play, you have to be premium users. So with these two angles of, like, okay, what are the actions and what are the actions that we monetize through? We, um, built, like, packages, right? So if you save five times, then you get like a collector badge. Like, oh, yeah, great collector. If you practice, uh, three times, then, um. I don't remember the names right now. That was a year ago. But, uh, we got like, um. That was more or less how, uh, how, uh, how we defined those badges. Like, trying to make sure that our users were doing the core actions on our apps, but also actions that were monetized.

Speaker C: And was the. Was the streaks kind of the next part of that then?

Speaker B: Yes.

Speaker C: Yeah.

Speaker B: So as I mentioned, uh, we did have streaks already, uh, in the I Am app. Um, and there was more about. You come in the app, you, um, practice a couple of affirmations, and then we give you the streak of the day.

Speaker C: Ah.

Speaker B: So in vocabulary, we started doing the same. So you come in the app, you read three words, and then we give you, uh, your. Your, uh, um, uh, the plus one on your streak. Um, and we started like that. And then we experimented a lot around. Okay, maybe that's creating too much disturbance for the users. So what if we just give them app, uh, open. That's how it is at the moment. We discovered that streaks by the sake of a streak wasn't the point for us. It was more the fact that with streaks, we were able to set new reminders to users, remind you to come back to the app. That was exactly what was making our users return more than the streak itself. Because we did a lot of experiments with reminders through streaks. How often, what's the time? Um, how do we push people to set their reminders, Et cetera, et cetera. That was a lot around that area. And also the visualization of those streaks. Do we need to interrupt the users when they come to the app? Because one important thing from our users, from our user behavior to understand is that many of our users come from outside of our app content. And I mean, like reminders that we send them on a daily basis. Um, or the widgets that they have set up in the. In the app. Right. That's how they access a lot of the times our app. And what we found also while talking to some of our users or like through also some feedback that we got through support was that there are times when users were finding a reminder interesting. Oh, that word is interesting. Right. I'm going to go to the app and save it, for example. And then we were like, uh, your streak. And they were like, no, that's not why I came here. Right. So we did a lot of digging, uh, on that area to find like, okay, we do want to have streaks because it was proven to us that it helped retention through those reminders. But we want to be careful with these interruptions. And how do we visualize all the streaks.

Speaker C: Yeah.

Speaker B: In the app.

Speaker A: So how did you solve that problem? Like, if someone taps the notification or the widget or something, and it takes them straight into a particular screen in the app, how do you show them that the streak has been incremented without getting in the way of what they're trying to do?

Speaker B: So at the moment, we do have, like, A toast that appears in the app. So it's very subtle. When you come in, you just see like at the top of the screen you added one more streak, but that's it. Unless you hit like a major milestone, which is something that we have experimented a lot on as well. What is important to reach the 3, the 7, the 14 day streak. What are the things that we should celebrate? And we did a lot of tests around timings and et cetera. So if you come into a normal day, you would just see them like the toast at the top of the screen. So you see it, but you can also see the content.

Speaker C: Got it.

Speaker A: And what were the results of the tests on? Um, what time to send those reminders? And do you send like one reminder a day or do you maybe send two? Like what, I'm curious where all that testing led you.

Speaker B: So we have different streak related reminders. So we have like the day, the day reminder, the night reminder. So depending on whether you've interacted with, with the reminder or you've come to the app in the morning, at midday or at night, we have different batches of reminders to tell you like maybe before midnight, we have like, okay, the day is about to end. Come back to the app to uh, keep your streak. So we did a lot on um, like okay, when, when are the right times to, to add those, those reminders? And also time zone related. So the bedtime, like the comeback to the uh, app before bedtime. For example, at first we, we started around 10pm or something like that. And then we realized most of our taps on those reminders were coming on the next day. So people was already missing the streak. And then we realized, wait a minute, uh, we're Spanish, right? So we go to bed pretty much late, but maybe in the US 10pm is too late. Yeah, uh, so we did a bit on that too.

Speaker A: Mm, got it. So for someone who is engaging with the app in the morning, are you gonna, you send them a streak reminder already, like in the afternoon and then if you, if they don't extend the streak, maybe you send them another one in the evening at like 8pm or something. Is that how it works?

Speaker B: So if you've already come in the morning and you have already got the plus one on your streak, then we won't remind you. But in the next morning if you haven't come, and we know that you came in the morning, so that, that came from, from learning that most of the users come back to the app, um, the same times. They're apps for self growth. Right. And a lot of the times users tie the usage of the apps to their daily routine. So when we talk to users, they tell us, oh, no, you, you know, using the app is part of my morning routine or is part of something I do on my lunch break or before bedtime. So we tried to send the reminders at the same time. So at the time when we know that they were using the app the previous day. So if today you came in the morning, then the next morning you would receive a reminder. If you haven't come, starting the cycle of like, okay, if you haven't come in the afternoon, you get the afternoon, or at night, et cetera. And then depending on when you come that day, we would, um, set the reminder for you the next day, more or less. That's how it works.

Speaker A: That's interesting. Yeah, I mean, uh, Charlie and I have been building streaks for Trophy for our customers, and we've been slowly adding this type of. Basically the same process that you guys have followed. Like, we are following that process ourselves. So we started with just the basic streak with a simple reminder that goes out like at 8pm and then we added like the time zoning and making sure that the time zones are perfect for all the users in different time zones. And now I think the next step will be, I mean, we just added freezes, which I meant to ask you about. But, uh, now we're figuring out like, the optimizing the send time on those, on those streak reminders. And so we'll have to do something similar where it's like, yeah, analyze the usage patterns, if there's a consistent pattern, then try to send off the back of that pattern. So freezes. Did you guys experiment with freezes?

Speaker B: We did, yes, more or less. Same as with the whole streaks mechanism. Um, we found out that less distraction was what was working best for us. We experiment with, okay, what if we give them streak freezes, but we give them the option to choose to use them or not use them. For example, that was a test that we ran and we decided to just give them the freezes, uh, automatically without users having to choose to freeze their streak. We also experimented with how many freezes to give. 3. 5. 2.

Speaker C: Right.

Speaker B: Our sweet spot was 3.

Speaker A: 3. Okay.

Speaker B: And we experimented with the streak freezes reminders too. I think that's why we found out streak freezes were working. As I said before, the whole streak mechanism. For us, our biggest learning was it's not about streak, it's about the fact that you have reminders to users outside of your application. And streak freezes were another way for us to also add, uh, reminders. Right. Um, come back before your freeze stops. Um, these kind of things is what worked out the best for us.

Speaker C: That makes sense. It's like a. Is another excuse or like a valid excuse.

Speaker B: Definitely. Yeah. To knock your door and say, hey, come back.

Speaker C: Yeah.

Speaker A: Do the users earn new freezes over time on some particular interval?

Speaker B: Yes. We also did a bit of experimentation with that on like, okay, should we tell them? How do we tell them, you know, like how many freezes you have, how many are about to earn in the future at the end? Also following this whole concept of minimalism, super simple. We just give them more freezes at, ah, certain milestones. It's completely invisible for them. We don't tell them they're about to get it. It's like a nice surprise.

Speaker C: Okay.

Speaker B: Um, and yeah, that's it.

Speaker A: So you just. So if a user gets to like a seven day streak, for example, then they'll get a new freeze. They can't have more than three no matter what. Right. So if they get to seven days, then it just stays the same. Okay.

Speaker B: So we only give you extra if you've run out. Basically.
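
Editor's sketch: the freeze rules discussed in this exchange (a cap of three, freezes applied automatically, silent top-ups at milestones only when the user has run out) might look something like the following. The milestone days, the one-freeze top-up size, the starting allotment, and the `Streak` class itself are assumptions for illustration; the episode confirms only the cap of three and the general behavior.

```python
from dataclasses import dataclass

MAX_FREEZES = 3            # the "sweet spot" reported in the episode
MILESTONES = {7, 14, 30}   # hypothetical milestone days; not specified in the episode


@dataclass
class Streak:
    days: int = 0
    freezes: int = MAX_FREEZES  # assumption: users start with the full allotment

    def end_of_day(self, opened_app: bool) -> None:
        """Update the streak once per day, based on whether the app was opened."""
        if opened_app:
            self.days += 1
            # Silent top-up: only at a milestone, and only if the user has run out.
            if self.days in MILESTONES and self.freezes == 0:
                self.freezes += 1
        elif self.freezes > 0:
            # The freeze is consumed automatically; the user is never asked.
            self.freezes -= 1
        else:
            self.days = 0  # no freezes left, so the streak resets
```

Because the top-up is invisible, any messaging would live in the reminder layer (for example, a nudge before a freeze is used up) rather than in this state update.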

Speaker C: Yeah.

Speaker A: Makes sense.

Speaker C: This is all really interesting stuff and I guess what would you say would be a good starting point for somebody that wants to start looking into this in their app, but maybe doesn't have the know how or the bandwidth to really do all of these? Would you say just, just set up some kind of basic test and just start simple and go from there or what would be your advice?

Speaker B: Yeah, I would say like find the mechanism of when are you going to give streaks to your users? Is it at uh, app open or like web open, whatever. Like the platform you're using? Right. Or is it based on any specific action? I think that that's pretty key because that defines a lot how much you're going to care about streaks and then definitely add uh, the reminder component to it. For us it was the biggest learning and I wouldn't leave it out of a potential MVP and just find a way to visualize it and then if you have the chance, just iterate on it till you get the point where the visualization of it doesn't interrupt your experience.

Speaker C: Sure, yeah, I think that's good advice. I think you mentioned in our first conversation that you've built a lot of the testing tools in house. Do you have different cohorts of people, where you would say this person's going to get a notification, but these other people we'll leave out, and we'll just A/B test and see what their retention difference is? Or how are you doing those?

Speaker B: I mean, we do have grouping in our in-house tool. We can segment. We have four user groups at the moment and we can decide to roll the test out to all four groups or to one, two, or three groups. Most of the time we test on our whole user base, unless we have multiple tests that we think are conflicting, and then we choose to maybe run one on two user groups and the other on the other two user groups. That's how we're using the grouping. But most of the time we test on all our user base.

Speaker C: Okay. Just kind of like for a limited period. And you'll monitor, see what happens and then maybe take it out later.

Speaker B: Yes.

Speaker C: Okay.

Speaker A: Yeah, I was going to ask about that because I think one of the big things we'll be adding to Trophy in the future will be a kind of experimentation stack. And I'm curious: say you're running a test where one half of your user base has one set of logic for how streaks work. Maybe they have freezes, and the other half doesn't have freezes. If you run that test for a few weeks, you look at the data, and you realize that freezes actually don't do anything, you want to get rid of freezes. Do you worry about all those users getting upset because they got used to having freezes and now you're taking them away? Or do you let them continue with that old functionality, but then you have to maintain it forever? How do you manage that?

Speaker B: We roll back, and then we say sorry if needed. There have been times, and that's something we deal with through our support team or in app reviews, when users are asking us for specific features that maybe we tested for a while and then got rid of. Then we say, you know, thanks for the feedback. If we see that there's a lot of feedback, then that's also insight for us to say, hey, maybe we didn't interpret the results well enough, because there's also a qualitative part of feedback coming our way. In most cases there have been a couple of complaints or so about features that we've removed, and it hasn't been a big deal, especially because, as you already said in the question, we don't want to maintain features that we have seen are not working, putting in effort just for that small group of people. What we do sometimes as well is pause the test. So if we find out that something is not working as we wanted, and it would end up in a rollback, but we want to iterate on it instead of rolling it back and then doing another test on top, we normally just pause the test so that the users who have already entered that experience don't lose it, and they wait for the iteration too. So we sort of have that group of people frozen, let's say, and then we test with the remaining group. And if we find a better way, then we roll out to all our users.

Speaker A: So just to give a concrete example, let's say you started one test where one group was on no streak freezes and one group was on freezes, and then you weren't quite sure, so you wanted to run another test where, you know, group three gets a different number of freezes or something. You do that, and if that one succeeded, then you'd migrate the other groups all to what group three has. So the people that had freezes already would maybe have a smaller number of them, but the experience would mostly be continuous.

Speaker B: Yes.

Speaker A: Is that right?

Speaker B: Yeah, yeah, exactly.
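
The difference between pausing and rolling back a test, as described above, can be sketched like this. It is an illustration under stated assumptions: the `Experiment` class, the `Status` values, and the idea that every visiting user is enrolled while a test runs are all hypothetical simplifications, not the guest's in-house tool.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    RUNNING = "running"
    PAUSED = "paused"            # enrolled users keep the experience, nobody new enters
    ROLLED_BACK = "rolled_back"  # everyone returns to the original experience


@dataclass
class Experiment:
    name: str
    status: Status = Status.RUNNING
    enrolled: set[str] = field(default_factory=set)

    def sees_variant(self, user_id: str) -> bool:
        """Return True if this user should get the experimental experience."""
        if self.status is Status.ROLLED_BACK:
            return False
        if self.status is Status.RUNNING:
            # Assumption: every user who arrives while the test runs is enrolled.
            self.enrolled.add(user_id)
        # While paused, only users who entered earlier stay in the variant.
        return user_id in self.enrolled
```

Pausing keeps the already-enrolled group "frozen" on the variant while a new iteration is tested on everyone else; a rollback removes the variant for all users.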

Speaker A: That makes sense. Yeah. Yeah. Um, Charlie, questions on experimentation before we continue?

Speaker C: Yeah. There was one more. You mentioned you had four categories of users. How did you kind of decide what those categories were? Was it based on usage pattern or kind of location or anything like that?

Speaker B: No, they're completely random.

Speaker C: Totally random.

Speaker B: And that's on purpose so that we can compare the experiences, um, truthfully. Right, right.

Speaker C: Okay. So four just kind of is the right number to give you enough options to play around with.

Speaker B: Most of the time we use it when we have tests that we want to run in parallel and that conflict, and most of the time we use group one and group two in one test, and group three and group four in the other. So basically, two groups could have been enough for us, but we have four, just in case.
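
One way to picture the four random groups and the split of conflicting tests across them is the sketch below. The guest only says the groups are "completely random"; hashing the user ID is an assumption chosen here so that the assignment is stable across sessions. The experiment names and the `user_group` and `active_experiments` helpers are hypothetical.

```python
import hashlib

NUM_GROUPS = 4  # the guest mentions four random user groups


def user_group(user_id: str) -> int:
    """Map a user to a stable pseudo-random group in 1..NUM_GROUPS."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_GROUPS + 1


# Hypothetical experiments: names and group splits are illustrative only.
EXPERIMENT_GROUPS = {
    "streak_freezes": {1, 2},       # conflicting test A runs on groups 1 and 2
    "new_paywall": {3, 4},          # conflicting test B runs on groups 3 and 4
    "tooltip_copy": {1, 2, 3, 4},   # non-conflicting test runs on the whole user base
}


def active_experiments(user_id: str) -> list[str]:
    """List the experiments whose scope includes this user's group."""
    group = user_group(user_id)
    return sorted(name for name, groups in EXPERIMENT_GROUPS.items() if group in groups)
```

Because the two conflicting tests use disjoint groups, no user is exposed to both at once, while the whole-base test still reaches everyone.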

Speaker C: Do people kind of know about this practice that you're doing? I think one thing that Duolingo does quite well is they're quite open about the fact that they do all this testing. They talk about it at Duocon and other things like that, and they kind of tell you, oh, if you happen to see this, you're in the lucky cohort. Do you tell people about that ahead of time, that you're going to do this? Maybe that helps you roll it back a little bit as well. I don't know if that's something you've thought of.

Speaker B: Yeah, we don't have that. I mean, we're transparent about it. Whenever a user comes to us and says, hey, I'm experiencing this and I'm not liking it, or I miss this feature that you had and now you don't have, we're very transparent in saying, you know what, we build things by iterating and testing, and our product teams are testing things constantly. So that's why you were seeing that experience and now you're not seeing it, and things like that. So we're transparent about it with the user, but we don't promote it in any way in the app, just saying, like, we're testing this.

Speaker A: Yeah, because I feel like that goes a little bit against your brand, right? Duolingo is a very talky, outgoing brand, but it seems like your apps are more, like you said, minimalist, a little bit more laid back. You send the required notifications to nudge people to do what they're there to do, but otherwise you probably don't make a lot of noise, right?

Speaker B: And also, I don't know, I would be a bit concerned about that. Because imagine that I'm putting things like achievements live and I'm already telling the users, hey, you're the lucky one seeing achievements. I'm already biasing them to test something because, oh, I'm the lucky one that has been chosen to test this, so I'm going to go and check. So then when I look into the metrics, I see adoption numbers, and I don't know what they would have been if I hadn't told users that this was something uniquely for them.

Speaker C: True.

Speaker B: So yeah, that's good.

Speaker C: Interesting.

Speaker A: How long does each test take? More or less. What's the iteration interval here?

Speaker B: Well, it depends a lot on the sample and where it is, but we tend to make decisions in around two to three weeks. There are tests that we've left running longer, because maybe they're retention-related tests where we need to see how the curve goes, or they're in parts of the app that don't have that much traffic, and then we leave them for a bit longer. But most of the time we make decisions in two weeks.

Speaker A: Okay, good to know.

Speaker C: Yeah.

Speaker A: So after streaks, were there any other gamification elements that you've added to the vocabulary app?

Speaker B: Yeah, after streaks, we put a lot of effort into games per se. So the practice part of Vocabulary, as I said before, started with just a 10-question game. It was only one type of game. There wasn't much more to it, even though users were loving it. But in some surveys and some conversations with users, they also tended to tell us about different game dynamics that they would like to have in the app. And that's what we did. We started adding new games. We launched what we call the game library, where you can have multiple games. At the beginning, I think there were four games, so the one that we had plus three, and then a shuffle that was a combo of all of the games. And we also gave the users the option to vote for their next game. And it ended up being a rollout, basically, because it helped retain users in practice. We saw that users were coming back to practice more often and there were some games that were starting to peak. Not only the game that we had at the beginning; some games like synonym matching, for example, were very popular. So users were liking them a lot. And there was also an extra monetization layer: we saw that when we were showing the gallery, users tended to monetize more. We think that's because this gives me access to more things, not just one game. It gives me access to multiple games, so I'm more willing to pay for it.

Speaker C: Makes sense. Yeah.

Speaker A: I expect people see more long term value.

Speaker C: Right.

Speaker A: Because it's like if they see one game and it's like, maybe I'll get sick of that in a week. So I don't want to pay for a subscription, but if they see a whole library of them, it's like, oh, I could get entertainment and usage out of this for a longer period.

Speaker B: And then on top of that, I think this was this summer, so it's quite recent, I think a couple of months ago or so, we started adding game dynamics to our existing games. There was a point where we realized, okay, this whole game library is becoming a crucial part of our app. As I said, it's one of our most monetizing features post-onboarding, and our premium users are very much retained through that practice feature. But maintaining it and continuing to add more games has an extra cost. So what we did was, okay, we're going to add more games, but not by inventing new games; instead we add game features within the existing games. That's when the concept of lives and time came into play, and we launched the challenge games, which are essentially the same game dynamics that we had, but with lives or time pressure. They have become the most popular ones in our library.

Speaker A: So that's, it's kind of presented almost as another like group of games, but really it's the same games with some added elements. That makes sense. Yeah, it's a way to double the size of the library, but it actually doesn't double the size of the work for you guys. So it makes a lot of sense.

Speaker B: Yeah.
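
The idea of growing the library by layering lives or time pressure on existing games, rather than building new ones, might look like the sketch below. All names here (`ChallengeRules`, `play`, the answer tuples) are hypothetical and only illustrate the approach; nothing about MonkeyTaps' actual game code is known from the conversation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChallengeRules:
    """Optional pressure added on top of an unchanged base game."""
    lives: Optional[int] = None           # wrong answers allowed before the round ends
    time_limit_s: Optional[float] = None  # total seconds allowed for the round


def play(answers, rules=ChallengeRules()):
    """Score a round. `answers` is a list of (is_correct, seconds_taken) pairs."""
    score, elapsed, lives = 0, 0.0, rules.lives
    for is_correct, seconds in answers:
        elapsed += seconds
        if rules.time_limit_s is not None and elapsed > rules.time_limit_s:
            break  # ran out of time before this answer counted
        if is_correct:
            score += 1
        elif lives is not None:
            lives -= 1
            if lives == 0:
                break  # no lives left
    return score
```

The same question flow serves the classic mode (no rules), a lives mode, and a timed mode, so each base game yields extra library entries with little added maintenance.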

Speaker A: And it feels like there's a natural path from that to leaderboards, to maybe some social element, if this is something where you expect people to have friends that are also using the same vocabulary app. Because now that everyone's playing games, and the games actually have either some sort of score element or some sort of time-based thing, you have a way to compare users against each other, right? Is that the next thing on the horizon?

Speaker B: Indeed, yeah. That's what we're focusing on right now. In the coming months, hopefully people start seeing some of those tests running. It's the whole concept of: where was I before I started playing? Where am I now? How am I compared to myself and to other peers? That's what we're exploring now: adding stats, levels, that kind of thing. Leaderboards, and maybe getting back into our achievements, more connected to the practice. Maybe it's a better time for us to launch features like that.

Speaker A: Makes sense. Yeah. I mean, how are you thinking about the leaderboards thing? Because we've talked to a couple of different people that have experimented with leaderboards. Actually, recently we talked to, was it Programiz, Charlie, that had the leaderboards thing? Yeah, the big platform for learning how to code. They introduced leaderboards first as a global leaderboard where you just see how you were doing in your programming lessons relative to every other user. And they found that it wasn't super effective in general. Because imagine you're a new user, right? You onboard and you do a couple of lessons or whatever, and then you look at the leaderboard and you're all the way at the bottom, because you just started. That's not nice for users to see. So what they've iterated on is basically having a cohort of you and your friends on a leaderboard together. Then instead of seeing random people that are way better than you, you see your friends and how you compare against them. So I'm curious, how are you planning to batch users so that they're on a leaderboard that doesn't look too scary?

Speaker B: Yeah, so at the moment we don't know yet, because these are just concepts and conversations that we're having, but that was definitely one of the things we're discussing. Especially, as you said, comparing new users to users that have been using the app for two years is not that fair. Sometimes we've talked about, what if we compare them by levels? People at your level of usage, rather than friends. Because adding the friends component has other, more technical implications that maybe we don't want to get into until we prove the value of features like leaderboards. So we're trying to find what we can do with the data that we already have from our users. So maybe not compare you with the whole user base, but compare you with people that have been in the app the same amount of time, or have had the same amount of reads or interactions within the app. That's something that we're thinking about, but as I said, it's all still in the ideation process.
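
The cohort idea the guest is considering, comparing users only with people who have spent a similar amount of time in the app, could be sketched as follows. The guest stresses this is still at the ideation stage, so this is purely illustrative: the bucket boundaries, the `Player` fields, and the function names are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass

TENURE_BUCKETS = [7, 30, 180]  # hypothetical upper bounds, in days since install


@dataclass
class Player:
    user_id: str
    days_in_app: int
    score: int


def tenure_bucket(days_in_app: int) -> int:
    """Return the index of the first bucket whose upper bound covers this tenure."""
    for index, upper in enumerate(TENURE_BUCKETS):
        if days_in_app <= upper:
            return index
    return len(TENURE_BUCKETS)  # everyone beyond the last bound shares one bucket


def cohort_leaderboards(players: list[Player]) -> dict[int, list[str]]:
    """Rank players by score, but only against others in the same tenure bucket."""
    boards = defaultdict(list)
    for player in players:
        boards[tenure_bucket(player.days_in_app)].append(player)
    return {
        bucket: [p.user_id for p in sorted(members, key=lambda p: p.score, reverse=True)]
        for bucket, members in boards.items()
    }
```

A brand-new user is then ranked only among other new users, so a two-year veteran's score never pushes them to the bottom of the board. The same structure would work with interaction counts in place of tenure.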

Speaker A: That's similar to the Duolingo approach, I think, where you basically get placed into a league and they show you at the end of each lesson where you are, and then if you do really well, you move on to the next league and you get another batch of users that you're compared against. If you're too far at the bottom, they'll just move you to a lower league, so you pretty much always end up somewhere near the middle. So yeah, that's interesting. It sounds like the friends aspect is something you aren't super interested in yet, or at least it would require a lot more work, so maybe it's harder.

Speaker B: It's not a focus for us at the moment because I think we don't have much built on that yet. I think that's more long term. First, we don't even have any kind of stats in the application. There's no way for our users to know how much they've read, how much they've practiced, or even to look back at the scores on their practices. We give them a score at the end of each practice, but there's no way for our users to go back and see, oh, I've scored five out of 10 today, but I scored three out of 10 yesterday. So we're at such a basic stage that connecting users to each other is on another level. We need to start from a way more basic place.

Speaker A: One of the big interesting things about MonkeyTaps is how you have all of these different apps, right? So what you've learned from the Vocabulary app, you're probably going to want to roll out to some of your other apps that are similar. So I'm curious, how easy is that to do? For example, your game system: did you design and build that in a way that you could put it into a new app, just with slightly different content in the games, so instead of words it's something else? How did you plan for that?

Speaker B: Yeah, definitely. It's one of our core success stories, I don't know how to say it. The way we built it gives us the option to put it in another app, almost like copy-paste. And that's how we built games in Vocabulary too. At the moment we're also exploring this, and we have already launched it in the I Am app. If you go to the I Am app right now, you will see the practice feature already there, a similar experience to Vocabulary. I mean, there's only one practice, and the experience is a bit different. You see content; you don't play something like "choose the affirmation". But the concepts of "let's give this a place in our main" and "let's make it premium only" were transported from Vocabulary to the I Am app. And we're also exploring how to bring a gallery into I Am, a gallery of practices following the same logic that we follow in Vocabulary: something that is easily maintainable over time, but that also gives users the feeling that, okay, there's a variety of practices that I can get access to within my premium subscription. That's what we're doing right now. So at the moment we have those two focuses: the levels, stats, and leaderboards that we talked about, but also how we bring all the learnings from Vocabulary into I Am, which are very transferable. The users need to practice their affirmations, not only consume the content, but we need to find ways to help them focus and access different kinds of practices at different times of the day or for different needs in terms of content. Are you needing affirmations because you're in a stressful moment, or because you need to boost your self-confidence and self-love? These kinds of thoughts are the ones that go into the whole practice part in the I Am app.

Speaker A: Do you have to now then retest everything though? Because it's a different set of users and maybe something that works for vocabulary won't work for affirmations. And how much of the learnings from the vocab tests do you then apply straight away into the other apps as opposed to testing them straight away?

Speaker B: Almost never. And I say almost because there have been times, depending on the test. Things like tooltips, for example: I remember we experimented maybe a year and a half or two years ago with tooltips showing, I don't know, where the favorite is or where the save is. Maybe we tested it in one app and then we transferred it to another. Those were simple tests where we said it's very unlikely that the users in that other app behave very differently. But with things like practice in I Am and practice in Vocabulary, the users are completely different. The app is different. So we do test it as well, because we want to understand the behavior of the users in that specific app. So yeah, we will retest it. We are, actually.

Speaker A: So the system you have for sharing these features between your different apps is flexible enough that you can make some significant changes from one app to the other if you have to, based on the test results. Cool. I mean, it sounds like you guys have built a quite complicated but effective system for sharing all that logic between different apps, while still letting each one have its own take on it.

Speaker B: I guess the challenge for every product manager out there, or anyone that's building apps or digital products (sorry, I'm very biased to apps right now), or any product in reality, is: what is it that I should be building? And for us, there's a big part, of course, of looking into the performance of our apps, understanding our data, and learning from all those A/B tests that we have been mentioning during the whole conversation. But there's also a big factor of being, or trying to be, very close to our users. And there are certain things that we have done across the company to make sure that happens. It starts with understanding where our users are talking to us. We have different ways. We have them talking to us through ratings and reviews in the App Store. So let's make sure that we internally are the ones reading those reviews and replying to them. So we have a monthly rotation where anyone in the company, not only product, regardless of function, has the duty to look into those ratings and reviews for a whole week and reply to our users. Of course we have things automated so that this is not a very painful task, but it's our way to hear what the users are saying. We then share these in our Slack channels: okay, users are talking about this. It's also our way to find bugs or things that are not working. That's ratings and reviews. Another way that we have our users talking to us is through support. They email us or they use our chatbot. That's not extended to the whole company, because developers need to develop, right? They're the key guys that make everything work. So we kept it only for the product team, and we rotate on it.
So for example, today, while I'm talking to you, I'm also the support representative at MonkeyTaps, answering the emails from the users. That's also a way to know what's going on. And our users are very vocal. I guess they're vocal in many other places too, but we get very good feedback from our users through our support team. And then we established what you mentioned, the Banana Talks. We have a block in our calendars every Wednesday, so today is the lucky day, in the afternoon, which we leave for users to book a slot to talk to us. We come into those chats either with an interview script, very generic, on how they're using our app, or maybe we use it to show different concepts that we're trying to define and get some feedback. This has been a very challenging thing. We just have a section in the app that says "earn a premium subscription". Before, they were able to just book a slot with us; if they come to the interview, we have a chat and we give them a lifetime subscription. That's how it works. What we found out is that a lot of users were signing up but not showing up, which was a bit frustrating for us because a lot of time was invested in joining the call, preparing, et cetera, for them not to show up. So we recently launched a VideoAsk survey. It's powered by Typeform, and it's basically a video interview where we record ourselves asking the questions and they record themselves answering. That way we're getting way more feedback. We still have the slots on Wednesday for them to sign up for a deeper interview if they're interested in talking to us. That's also our way to build a bit of a sense of community. We often have users that email us through support and say, I have this feedback, and they do this recurrently every couple of months.
So that's our way to say, hey, if you want to come and chat with us, you can during these slots. And we also have a team rotation on that. Every week a different person is looking at the videos, gathering those insights, and sharing them with the team. So we have these three sources of user feedback: the reviews, the support emails, and the video interviews or the in-person interviews.

Speaker C: Yeah, that's great. Can you kind of choose who you're speaking to or is it mainly just anybody that's happy to turn up and give you some kind of feedback?

Speaker B: Anybody, really. In the past we did some user testing using usertesting.com, which is a tool where you pay a subscription and the users get paid, and then you can also segment, like, I want users that have been using this app for a specific period, that have this but don't have these other kinds of apps. But we decided not to put our money there anymore and to do it ourselves. Any kind of feedback, from anyone that wants to join and give us their feedback, is good enough.

Speaker A: You ever get some really weird people and feedback coming through those channels? Yes, yeah, I imagine so.

Speaker B: Yes, indeed. And the videos are sometimes funny. We've got people that just sing in the videos and don't really answer the questions. They just sing because they think that if they just record the video they will get the subscription. So yeah, we've had interesting ones. Or people just smoking, things like that. But it's also funny.

Speaker C: Yeah, shows that they really want the product. Then if they're happy just to sit there.

Speaker B: Yeah.

Speaker A: Yeah. And sing for it. That was perfect. One hour on the dot? Pretty much. All right, well, thank you for taking the time.

Speaker B: Thank you.

Speaker A: That was a great chat.
