AAAI, one of the leading AI research conferences, tested AI-generated peer reviews on more than 22k submissions. Authors also received human reviews and then answered a survey on their preferences.
The findings are quite interesting:
"The large-scale survey of AAAI-26 authors, reviewers, senior program committee members, and area chairs found that participants broadly found AI reviews useful and preferred them to human reviews on key dimensions such as technical accuracy and research suggestions, but also identified some limitations and areas for improvement including technical errors in reading some equations and tables, difficulty in prioritizing the significance of issues, and producing reviews that were longer than readers preferred"
The paper also reports criticism from respondents:
"Respondents also emphasized that AI reviews had the potential to mislead reviewers and other decision-makers in the review process. There were also concerns that authors might optimize papers for AI preferences rather than scientific quality, and that reliance on these tools could lead to a long-term decline in reviewing skill. Adding to this, many respondents voiced principled objections, arguing that the use of AI undermines the trust, human effort, and essential value of the peer review process."