A/B Testing in ASO. What Is It and How to Conduct It in Apple’s App Store or Google Play?

Nov 1, 2023

Nataliia Kaidanovska

A/B testing in ASO is the process of comparing two or more variations of visual or textual elements to determine which one store visitors find most appealing. You can A/B test screenshots, icons, and, in Google Play, textual metadata. The RadASO team will walk you through what A/B testing in ASO is, the key differences between A/B tests in the App Store and Google Play, and how to run them correctly.

A/B Testing in ASO – How it Works

Split the total number of users into two groups: A and B. Group A continues with the usual experience and sees the current screenshots. Group B receives a new experience and views fresh new test screenshots. Continue testing until you identify the group with the superior installation conversion rates.
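To make the comparison concrete, here is a minimal Python sketch of how the installation conversion rates of the two groups could be compared. The visitor and install counts are invented for illustration and the function names are hypothetical; in practice the stores calculate these figures for you.

```python
def conversion_rate(installs: int, visitors: int) -> float:
    """Share of store page visitors who installed the app."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return installs / visitors


def relative_uplift(rate_a: float, rate_b: float) -> float:
    """Relative change of group B's rate versus group A's, in percent."""
    if rate_a <= 0:
        raise ValueError("rate_a must be positive")
    return (rate_b - rate_a) / rate_a * 100


# Hypothetical numbers, not taken from a real test.
rate_a = conversion_rate(installs=300, visitors=1000)  # group A: current screenshots
rate_b = conversion_rate(installs=330, visitors=1000)  # group B: test screenshots

print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  uplift: {relative_uplift(rate_a, rate_b):.1f}%")
```

A gap between the two rates only becomes meaningful once both groups have collected enough traffic, which is why the test has to keep running until the difference is stable.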

During the test launch, there is an opportunity to select various parameters:

The percentage of users to whom it will be displayed.
Countries in which the test will be conducted.
Conditions under which the test will be considered successful.

However, you cannot choose which users will see the test or control the demographics of the audience.

The main objective of A/B tests in ASO is to find a variant that converts better than the current one. Sometimes minor changes, such as a different color for the CTA button, lead to significant differences in how users interact with the application. When creating a hypothesis, specify what you will change and why.

What to Consider when Preparing a Hypothesis for A/B Testing

Choose an element to change (test) that, in your opinion, will have a significant impact on users, for example, the background of one of the screenshots. The hypothesis may be that changing it will increase conversion.

Define the specifics of the change. State clearly what you want to change in this element and add approximate references. For example, "replace the dark background with a light one" or "replace the background image of people with a solid background."
Evaluate how the change affects users. The test will not show results if only a small percentage of users notice the change.
The change should be noticeable on the first three vertical screenshots (if there's a video, on the first two). On horizontal screenshots or videos, the change should be obvious right from the start.

Let's look at examples:

Example 1. The change is made on the sixth screenshot, so it is not immediately apparent. Most users only look at the first few screenshots and don't scroll to the end. Such a test is not useful because its results do not allow you to draw a meaningful conclusion.

Example 2. Changes are immediately noticeable on the very first, most conversion-driven screenshot. Only one crucial shift is being tested, not several simultaneously. The results of this A/B test will reveal what users find more alluring for viewing and downloading.

A/B Test Differences Table in App Store and Google Play

| | Google Play | App Store |
|---|---|---|
| What can be tested? | Short description, long description, icon, feature graphic, screenshots, videos | Screenshots, videos, icon (has to be uploaded to the build*) |
| Number of simultaneously running tests | 5 tests. Each test is valid within a single country. You can also choose a default country test (details below): it will then run in all countries that have no localized graphical or textual materials. | 1 test. The test can be extended to all countries where the application is available, or limited to specific countries as needed. |
| Number of test variants that can be compared with the current version in the store | A maximum of 3 new variants | A maximum of 3 new variants |
| Can a test be launched while another item is under review? | Yes | No |
| Mandatory formats for screenshots uploaded to the store | 6.5 | 6.5; 5.5; 12.9 (if there is an iPad version) |

*A build is a specific version of the application that is ready to be downloaded and installed on users' devices. It contains all the files and data needed to install and use the application. The icon can only be updated together with the application version in the store.

More about optimizing graphic elements in the App Store and Google Play can be found in the article 'Graphics in Mobile App Promotion in the App Store and Google Play (ASO) – How to Optimize Graphic Elements.'

How to Publish an A/B Test in the App Store

1. Navigate to the Product Page Optimization tab in App Store Connect.

2. After naming the test, specify the type of test you are launching (A/B, A/B/B, or A/B/C test, etc.), the countries for displaying this test (by default, all 39 countries are selected), and an approximate test duration.

3. Upload your graphic materials.

For a more detailed description, read the official App Store documentation.

How to Publish an A/B Test in Google Play

1. On the Store listing experiments tab in the Google Play Console, select the countries where you wish to conduct the test. Unlike the App Store, you can only choose one country for one test or opt for a test in the default country (i.e., for all countries without localized graphic or text materials, depending on what you are testing). So, determine whether the test will be conducted in the default or a specific country.

More information can be found in the official documentation.

2. Configure the settings that affect the accuracy of the test and the number of installs it needs:

Target metric: users who installed the application, or users who installed it and did not delete it within the first day.
Test variants to launch (A/B, A/B/C, A/B/C/D; more on the main differences below).
The percentage of visitors who will see an experimental variant instead of the currently active one.
Minimum detectable effect: the smallest difference between a new variant and the currently active one that will determine the winner.
Confidence level of the test results.
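These settings interact: the smaller the difference you want to detect and the higher the confidence you require, the more visitors each variant needs. The sketch below uses a textbook two-proportion z-test approximation to illustrate that relationship. It is not the calculation Google Play performs; the function name, the 80% power default, and the reading of the minimum difference as a relative change are all assumptions made for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate: float,
                         min_relative_effect: float,
                         confidence: float = 0.90,
                         power: float = 0.80) -> int:
    """Approximate visitors needed per variant (two-sided two-proportion z-test).

    baseline_rate: current conversion rate, e.g. 0.30 for 30%.
    min_relative_effect: smallest relative change to detect, e.g. 0.05 for 5%.
    """
    if not 0 < baseline_rate < 1:
        raise ValueError("baseline_rate must be between 0 and 1")
    if min_relative_effect <= 0:
        raise ValueError("min_relative_effect must be positive")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + min_relative_effect)  # assumed relative change
    if p2 >= 1:
        raise ValueError("effect pushes the conversion rate to 100% or more")
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


# Hypothetical app: 30% baseline conversion.
for effect in (0.05, 0.10, 0.20):
    print(f"detect {effect:.0%} change:", visitors_per_variant(0.30, effect))
```

The loop shows the general pattern: halving the difference you want to detect roughly quadruples the traffic required, which is one reason low-traffic apps need long tests.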

3. Determine what to test. Unlike in the App Store, you can test not only graphic elements but also text (full and short descriptions).

For A/B tests, you can only upload screenshots in one size. Google will automatically adapt them to other formats.

How to Prepare the Application for Testing

1. Run A/B/B tests, not A/B tests.

A is the variant of screenshots (or other tested materials) that is currently in the store.

B1 is the new variant of screenshots that needs to be tested.

B2 is a duplicate of the screenshots being tested.

A/B/B tests additionally confirm the reliability of the results. Ideally, B1 and B2 should show fairly similar performance metrics (more about this in the 'A/B Test Results' sections below).

2. The test should last for at least two weeks (depending on how much traffic the application is getting).

As shown in the example below, sometimes this is not enough. The total amount of traffic was low, so two weeks turned out to be insufficient. The results stayed ambiguous for about a month, but after a month and a half, options B1 and B2 showed significant improvements. In total, the test lasted more than 70 days.
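A rough way to see why low traffic stretches a test is to divide the visitors each variant needs by the visitors it receives per day. The Python sketch below does exactly that. The function name, the even traffic split across variants, and all the numbers are assumptions for illustration, not store behavior.

```python
def estimated_test_days(visitors_needed_per_variant: int,
                        variants: int,
                        daily_visitors: int,
                        min_days: int = 14) -> int:
    """Days until every variant has collected the visitors it needs.

    Assumes traffic is split evenly across all variants, including the
    current one, and never returns less than the two-week minimum.
    """
    if variants < 2:
        raise ValueError("a test needs at least two variants")
    if daily_visitors <= 0:
        raise ValueError("daily_visitors must be positive")
    total_needed = visitors_needed_per_variant * variants
    days = -(-total_needed // daily_visitors)  # integer ceiling division
    return max(days, min_days)


# Hypothetical A/B/B test (3 variants) on a low-traffic app.
print(estimated_test_days(visitors_needed_per_variant=12_000,
                          variants=3,
                          daily_visitors=500))
```

With these made-up inputs the estimate lands well beyond two weeks, which matches the pattern described above: the lower the daily traffic, the longer the test has to run.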

3. Graphical changes should be significant.

Select a single hypothesis that has the greatest potential to impact the end-user in a significant manner.
Focus on the key changes in the first three screenshots during the hypothesis test (if there are also videos, focus on the first two). If the application has horizontal screenshots, focus on the first one.

In the case of rebranding (changing colors, fonts, characters, etc.), the screenshots should undergo a drastic transformation. This is also recommended if the previous screenshots are deemed to be unsatisfactory.

4. Cross-Marketing Activity

Consider global marketing activities. Users associate the brand with specific characters. Therefore, in all promotion channels and during tests in the store, use screenshots with the same characters.

5. Consider the strength of the brand.

A popular application (e.g., Netflix) receives the majority of its views and downloads through brand-specific search queries. Graphics have little influence on user choices. The results of such a test may not always be indicative, despite the amount of traffic and changes.

6. Cultural Localization

Pay attention to the cultural nuances of each region. Localize the language in the screenshots, add colors, elements, and individuals representative of the country. This will spark the interest of the local population.

A/B Test Results in Google Play

Glossary:

Audience – % of users who see the experiment.
Installers (current) – the number of actual downloads during the experiment.
Installers (scaled) – the number of downloads during the experiment divided by the audience share.
Performance – the likely change in conversion rates when applying the tested variant (the metric is available when there is enough data).
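The relationship between the two installer figures can be sketched in a few lines of Python. The function name and the numbers are hypothetical; the formula is the one from the glossary above.

```python
def scaled_installers(current_installers: int, audience_share: float) -> float:
    """Installers (scaled) = Installers (current) / audience share.

    audience_share is a fraction, e.g. 0.25 for a variant shown to 25% of users.
    """
    if not 0 < audience_share <= 1:
        raise ValueError("audience_share must be in the range (0, 1]")
    return current_installers / audience_share


# Hypothetical experiment: the current listing gets 50% of the audience,
# the test variant gets 25%.
print(scaled_installers(1100, 0.50))  # current listing
print(scaled_installers(500, 0.25))   # test variant
```

Scaling puts variants with different audience shares on the same footing, so their installer counts can be compared directly.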

Example 1:

Most likely, test variants B1 and B2 will win. However, if the range in the Performance column is not entirely in the 'red' or 'green' zone, the results should not be considered 100% reliable.

Let's calculate the expected conversion change:

for Treatment B1: (-11.5 + 24.9) / 2 = 6.7
for Treatment B2: (-9.5 + 15.1) / 2 = 2.8
average of the two values: (6.7 + 2.8) / 2 = 4.75

Conversion will increase by 4.75%. If the current conversion was 30%, the projected conversion will be: 30 + (30 * 4.75 / 100) = 31.43%*

*Important! Do not add the average Performance percentage to the current conversion; instead, change the current conversion by that percentage.
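The calculation above can be reproduced with a short Python sketch. The function names are hypothetical; the Performance ranges and the 30% current conversion are the ones used in the example.

```python
def interval_midpoint(low: float, high: float) -> float:
    """Midpoint of a Performance range, in percent."""
    return (low + high) / 2


def projected_conversion(current: float, relative_change_pct: float) -> float:
    """Change the current conversion BY the given percentage (not add it)."""
    return current * (1 + relative_change_pct / 100)


b1 = interval_midpoint(-11.5, 24.9)   # Treatment B1
b2 = interval_midpoint(-9.5, 15.1)    # Treatment B2
average_change = (b1 + b2) / 2        # expected relative change, in percent

projected = projected_conversion(30.0, average_change)
wrong = 30.0 + average_change         # the mistake the note above warns about

print(f"B1: {b1:.2f}  B2: {b2:.2f}  average: {average_change:.2f}")
print(f"projected: {projected:.3f}%  (not {wrong:.2f}%)")
```

The last line makes the difference between the two readings visible: changing 30% by the average percentage gives roughly 31.4%, while adding the percentage points directly would overstate the result.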

Example 2:

Both variants displayed significantly negative outcomes. Conclusion: the test was unsuccessful.

Example 3:

The same test variant produces different outcomes: B1 shows a favorable result, while B2 shows the opposite. In such a case, calculations using the formula won't yield reliable results to base your decisions on. B1 and B2 should yield more or less similar results.
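The three examples can be summarized as a simple rule of thumb, sketched below in Python. The function name and the labels are hypothetical, and the rule is only one way to read the Performance ranges of an A/B/B test, not something Google Play reports. Apart from the Example 1 ranges, the numbers are invented.

```python
def classify_abb(b1: tuple[float, float], b2: tuple[float, float]) -> str:
    """Classify an A/B/B test from the (low, high) Performance ranges, in percent."""

    def zone(interval: tuple[float, float]) -> str:
        low, high = interval
        if low > 0:
            return "green"   # whole range above zero
        if high < 0:
            return "red"     # whole range below zero
        return "mixed"       # range crosses zero

    z1, z2 = zone(b1), zone(b2)
    if z1 == z2 == "green":
        return "win"
    if z1 == z2 == "red":
        return "loss"
    if {z1, z2} == {"green", "red"}:
        return "contradictory"  # identical variants disagree, as in Example 3
    return "inconclusive"


print(classify_abb((-11.5, 24.9), (-9.5, 15.1)))  # ranges from Example 1
print(classify_abb((-12.0, -3.0), (-9.0, -1.5)))  # made-up ranges
print(classify_abb((1.0, 9.0), (-8.0, -0.5)))     # made-up ranges
```

Note that the Example 1 ranges both cross zero, so this rule labels them inconclusive, in line with the caution that such results should not be treated as fully reliable.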

A/B Test Results in the App Store

Glossary:

Conversion rate – the conversion rate of each test variant (Apple, in contrast to Google Play, displays this immediately).
Improvement – the relative difference in conversion rate between the variant being tested and the variant that's currently active in the store. Click it to see the percentage range over the entire testing period.
Confidence – the level of confidence in the result of each individual variant. It should be at least 50% to reach a conclusive decision about the test.

The Confidence and conversion rate improvement indicators in the chart below demonstrate that this test is a winner.

After adopting the winning test variant, measure the conversion once again.

A/B tests are an ongoing process because user preferences constantly change. Today, they might be drawn to a blue background, but later, red might receive more attention.

It's also important to evaluate the results accurately. A winning variant doesn't always guarantee an improvement in conversion once it is applied, and a losing variant isn't always worse, so drawing conclusions too hastily can lead to unexpected outcomes.
