I am trying to analyse whether three groups (sample sizes 27, 26 and 23) differ in their responses to 20 questionnaire items, each measured on a 5-point Likert scale. The items all investigate different things, so it is neither appropriate nor of interest to combine them into composite scales for broader constructs.

I am well aware of the controversy surrounding the use of ANOVA for Likert data. I have been advised that it could be justified, and I know of papers suggesting that ANOVA is relatively robust in most situations. However, my data are also non-normal and contain multiple extreme outliers. As these outliers represent real Likert responses, I believe they have value and should not be excluded. Overall, I feel there are a lot of arguments against using one-way ANOVAs to analyse the data at this point.

I have therefore also tried analysing the data using the Kruskal-Wallis test, which at first seemed more appropriate. However, the data violated the assumption that the distributions should have a similar shape across groups, so the analysis can only tell me whether the distributions of Likert responses differ, rather than whether the median responses differ, which would be easier to interpret (I think?). Following a guide to conducting and reporting the analysis, I arrived at findings like this:

*'Likert ratings increased from Group 1 (mean rank = 33.78), to Group 2 (mean rank = 38.41), to Group 3 (mean rank = 42.77), but the differences were not statistically significant, χ2(2) = 2.218, p = .330.'*
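For context, here is a minimal sketch of where the mean ranks and test statistic in a result like this come from. It uses simulated responses, not my real data: only the group sizes match my study, and the variable names are just for illustration. All responses for one item are ranked together (ties share the average rank), the ranks are averaged within each group, and `scipy.stats.kruskal` gives the tie-corrected H statistic, which is compared against a chi-square distribution with 2 degrees of freedom.

```python
import numpy as np
from scipy import stats

# Simulated data only: a fixed seed and uniform 1-5 responses stand in for one item.
rng = np.random.default_rng(0)

# Group sizes match the study; the responses themselves are hypothetical.
sizes = {"Group 1": 27, "Group 2": 26, "Group 3": 23}
groups = {name: rng.integers(1, 6, size=n) for name, n in sizes.items()}

# Rank all responses together; tied responses receive the average of their ranks.
pooled = np.concatenate(list(groups.values()))
ranks = stats.rankdata(pooled)

# Label each pooled response with its group, then average the ranks per group.
labels = np.repeat(list(groups.keys()), list(sizes.values()))
mean_ranks = {name: ranks[labels == name].mean() for name in groups}

# Kruskal-Wallis H test (scipy applies a correction for ties).
h_stat, p_value = stats.kruskal(*groups.values())

for name, mr in mean_ranks.items():
    print(f"{name}: mean rank = {mr:.2f}")
print(f"H(2) = {h_stat:.3f}, p = {p_value:.3f}")
```

With only five possible response values, most ranks are tied, which is part of why the mean ranks feel so far removed from the original Likert categories.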

Talking about 'mean rank' seems irrelevant to my discussion of whether, and in what ways, the groups differed in their Likert responses.

My supervisor is suggesting that I stick to one-way ANOVA, but I do not see how its results can be justified for this data. At the same time, Kruskal-Wallis does not seem suitable for my questions of interest.

I would greatly appreciate any advice on how I can best analyse my data at this point!