How to use Group By and Having Clause SQL query in magento 2?

Magento Group By SQL query used to fetch the same set of values from a specified column. While We use group by clause with aggregate function

We must have to use the Having clause and where clause is not used with Group By Aggregrate functions.

Aggregate functions like MIN, MAX, COUNT, AVG, SUM, etc…

Let’s take an example to find duplicate emails from the customer_entity table.

Duplicate email might be available if you use multiple websites with scope as a Website. Each website has its own email id and one website email is not working for the other websites.

Here we will find the duplicate email count using group by and having SQL clause,

<?php declare(strict_types=1);

namespace Rbj\Customer\Model;

use Magento\Customer\Model\ResourceModel\Customer\CollectionFactory;

class General
{
    /**
     * @var CollectionFactory
     */
    private $customerRepository;

    public function __construct(
        CollectionFactory $customerCollectionFactory
    ) {
        $this->customerCollectionFactory = $customerCollectionFactory;
    }

    public function findDuplicateEmail()
    {
        $customerCollection = $this->customerCollectionFactory->create();

        $customerCollection->getSelect()
            ->columns(['emailCount' => 'COUNT(e.entity_id)'])
            ->group('email')
            ->having('emailCount > ?', 1);
        return $customerCollection;
    }
}

Here we added a customer collection factory to the constructor to fetch customer collection.

We are using the group() clause with the email provided to find the email field from the customer_entity table.

We use the having() clause because we have used count aggregation in our query. we are checking emailCount conditions for the emailCount is available more than 1.