**Contents**show

### 1. What is a Data Analyst’s role in a company?

**Answer:**

A Data Analyst is responsible for collecting, processing, and analyzing data to support decision-making. They extract insights from data, create reports, and assist in making data-driven decisions.

### 2. How do you import a CSV file into Python for analysis?

**Answer:**

```
import pandas as pd
data = pd.read_csv('file_path.csv')
```

**Explanation:**

This code uses the `pandas`

library to import a CSV file. The `read_csv`

function reads the file and stores it as a DataFrame for analysis.

### 3. Explain the difference between ‘group by’ and ‘having’ clause in SQL.

**Answer:**

The `GROUP BY`

clause is used to group rows that have the same values into summary rows. The `HAVING`

clause filters groups based on a condition after the `GROUP BY`

operation.

### 4. How do you handle missing values in a dataset using Python?

**Answer:**

```
# Remove rows with missing values
data.dropna()
# Fill missing values with mean
data.fillna(data.mean(), inplace=True)
```

**Explanation:**

The code snippet demonstrates two common methods: removing rows with missing values or filling them with the mean.

### 5. Explain what a pivot table is and how it can be useful.

**Answer:**

A pivot table is a data summarization tool used in spreadsheet programs. It allows users to aggregate and analyze large datasets, making it easier to derive meaningful insights and identify trends.

### 6. How do you perform data visualization in Python?

**Answer:**

```
import matplotlib.pyplot as plt
data.plot(kind='bar')
plt.show()
```

**Explanation:**

This code snippet uses the `matplotlib`

library to create a bar chart from the dataset.

### 7. What is the importance of data cleaning in analysis?

**Answer:**

Data cleaning is crucial as it ensures the accuracy and reliability of the analysis. It involves removing errors, handling missing values, and standardizing data formats.

### 8. How can you identify outliers in a dataset?

**Answer:**

```
import seaborn as sns
sns.boxplot(x=data['column_name'])
```

**Explanation:**

This code snippet uses a boxplot to visualize and identify potential outliers in a specific column.

### 9. Explain the concept of correlation in statistics.

**Answer:**

Correlation measures the statistical relationship between two or more variables. It ranges from -1 to 1, where 1 indicates a strong positive correlation, -1 indicates a strong negative correlation, and 0 indicates no correlation.

### 10. How do you handle large datasets that do not fit into memory?

**Answer:**

You can use techniques like data chunking, parallel processing, or utilizing distributed computing frameworks like Apache Spark to handle large datasets efficiently.

### 11. What is the difference between a bar chart and a histogram?

**Answer:**

- A bar chart displays categorical data with rectangular bars, each representing a category.
- A histogram displays the distribution of a continuous numerical variable.

### 12. How do you perform data normalization?

**Answer:**

```
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
normalized_data = scaler.fit_transform(data[['column_name']])
```

**Explanation:**

The code snippet uses `MinMaxScaler`

to scale the values between 0 and 1, ensuring uniformity in the data range.

### 13. Explain the concept of one-hot encoding.

**Answer:**

One-hot encoding is a technique used to convert categorical data into a binary format. It creates binary columns for each category, marking the presence with ‘1’ and absence with ‘0’.

### 14. How do you identify multicollinearity in a regression analysis?

**Answer:**

```
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
X = data[['feature1', 'feature2', 'feature3']]
vif_data = pd.DataFrame()
vif_data["Feature"] = X.columns
vif_data["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
```

**Explanation:**

This code snippet uses the Variance Inflation Factor (VIF) to detect multicollinearity between independent variables.

### 15. What is a time series analysis, and how is it useful?

**Answer:**

Time series analysis involves analyzing data points collected or recorded at specific time intervals. It’s useful for forecasting future trends based on historical patterns.

### 16. How do you select the right algorithm for a machine learning model?

**Answer:**

- It depends on the type of problem (classification, regression, etc.).
- Consider the size and nature of the dataset.
- Evaluate the assumptions of the algorithm.
- Perform cross-validation to assess model performance.

### 17. Explain the concept of A/B testing.

**Answer:**

A/B testing is a method of comparing two versions of a webpage or app against each other to determine which performs better in terms of user engagement or conversion rates.

### 18. How do you handle imbalanced datasets in machine learning?

**Answer:**

- Use techniques like resampling (oversampling minority class or undersampling majority class).
- Utilize algorithms designed for imbalanced data, like Random Forest or XGBoost.
- Apply techniques like Synthetic Minority Over-sampling Technique (SMOTE).

### 19. What is the purpose of a SQL join?

**Answer:**

A SQL join combines rows from two or more tables based on a related column between them, enabling the extraction of relevant information from multiple tables.

### 20. How do you optimize a SQL query for better performance?

**Answer:**

- Use indexes on columns frequently used in WHERE clauses.
- Avoid using SELECT *; only retrieve necessary columns.
- Optimize subqueries or use JOINs instead.

### 21. Scenario: Calculating Total Sales

**Question:**

Suppose you have a table named `orders`

with columns `order_id`

, `customer_id`

, and `amount`

. Write an SQL query to calculate the total sales for each customer.

**Answer:**

```
SELECT customer_id, SUM(amount) as total_sales
FROM orders
GROUP BY customer_id;
```

### 22. Scenario: Finding Duplicate Records

**Question:**

You suspect there are duplicate records in a table named `users`

based on the `email`

column. Write an SQL query to identify and list the duplicate email addresses.

**Answer:**

```
SELECT email, COUNT(*) as duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
```

### 23. Scenario: Calculating Moving Average

**Question:**

Given a table `daily_sales`

with columns `date`

and `amount`

, write an SQL query to calculate the 7-day moving average of sales.

**Answer:**

```
SELECT date,
amount,
AVG(amount) OVER (ORDER BY date RANGE BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg
FROM daily_sales;
```

### 24. Scenario: Finding Missing Values

**Question:**

You have a table named `products`

with columns `product_id`

, `product_name`

, and `price`

. Write an SQL query to find products with missing prices.

**Answer:**

```
SELECT product_id, product_name
FROM products
WHERE price IS NULL;
```

### 25. Scenario: Calculating Employee Tenure

**Question:**

Given a table `employees`

with columns `employee_id`

, `hire_date`

, and `termination_date`

, write an SQL query to calculate the tenure (in years) of each employee.

**Answer:**

```
SELECT employee_id,
hire_date,
termination_date,
DATEDIFF(termination_date, hire_date) as tenure_in_days
FROM employees;
```

### 26. Scenario: Identifying Outliers

**Question:**

Suppose you have a table `sales`

with columns `transaction_id`

and `amount`

. Write an SQL query to identify transactions that are outliers (considering a threshold, for example, three times the standard deviation).

**Answer:**

```
SELECT transaction_id, amount
FROM sales
WHERE ABS(amount - (SELECT AVG(amount) FROM sales)) > 3 * (SELECT STDEV(amount) FROM sales);
```

### 27. Scenario: Finding Nth Highest Salary

**Question:**

Write an SQL query to find the 3rd highest salary from a table `employees`

with columns `employee_id`

and `salary`

.

**Answer:**

```
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT 2, 1;
```

### 28. Scenario: Deleting Duplicate Records

**Question:**

You have a table `orders`

with columns `order_id`

, `customer_id`

, and `order_date`

. Write an SQL query to delete duplicate orders based on the combination of `customer_id`

and `order_date`

, keeping only the latest order.

**Answer:**

```
DELETE o1
FROM orders o1, orders o2
WHERE o1.customer_id = o2.customer_id
AND o1.order_date = o2.order_date
AND o1.order_id < o2.order_id;
```

### 29. Scenario: Finding the Mode

**Question:**

Given a table `survey_responses`

with columns `user_id`

and `response`

, write an SQL query to find the mode (most frequent response) for each user.

**Answer:**

```
SELECT user_id,
response,
COUNT(response) as frequency
FROM survey_responses
GROUP BY user_id, response
HAVING COUNT(response) = (SELECT MAX(frequency) FROM (SELECT user_id, response, COUNT(response) as frequency FROM survey_responses GROUP BY user_id, response) as temp WHERE temp.user_id = survey_responses.user_id);
```

### 30. Scenario: Finding Overlapping Date Ranges

**Question:**

You have a table `bookings`

with columns `booking_id`

, `start_date`

, and `end_date`

. Write an SQL query to find overlapping booking periods.

**Answer:**

```
SELECT b1.booking_id, b2.booking_id as overlapping_booking_id
FROM bookings b1, bookings b2
WHERE b1.booking_id <> b2.booking_id
AND b1.start_date <= b2.end_date
AND b1.end_date >= b2.start_date;
```

### 31. Scenario: Calculating Percentiles

**Question:**

Suppose you have a table `exam_scores`

with columns `student_id`

and `score`

. Write an SQL query to calculate the 75th percentile score.

**Answer:**

```
SELECT PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY score) as percentile_75
FROM exam_scores;
```

### 32. Scenario: Finding Longest Consecutive Sequence

**Question:**

Given a table `numbers`

with a column `num`

, write an SQL query to find the longest consecutive sequence of numbers.

**Answer:**

```
SELECT MIN(num) as start_num, MAX(num) as end_num
FROM (
SELECT num, ROW_NUMBER() OVER (ORDER BY num) - ROW_NUMBER() OVER (PARTITION BY num ORDER BY num) as grp
FROM numbers
) as temp
GROUP BY grp
ORDER BY COUNT(*) DESC
LIMIT 1;
```

### 33. Scenario: Calculating Running Total

**Question:**

Suppose you have a table `sales`

with columns `transaction_date`

and `amount`

. Write an SQL query to calculate the running total of sales.

**Answer:**

```
SELECT transaction_date,
amount,
SUM(amount) OVER (ORDER BY transaction_date) as running_total
FROM sales;
```

### 34. Scenario: Finding Islands of 1s in Binary Matrix

**Question:**

Given a binary matrix represented by a table `binary_matrix`

with columns `row_id`

, `col_id`

, and `value`

(0 or 1), write an SQL query to find the number of islands of 1s.

**Answer:**

```
SELECT COUNT(*) as num_islands
FROM (
SELECT DISTINCT a.row_id, a.col_id
FROM binary_matrix a
WHERE a.value = 1
AND NOT EXISTS (
SELECT 1
FROM binary_matrix b
WHERE b.value = 1
AND ABS(a.row_id - b.row_id) <= 1
AND ABS(a.col_id - b.col_id) <= 1
AND (a.row_id != b.row_id OR a.col_id != b.col_id)
)
) as islands;
```

### 35. Scenario: Calculating Moving Average

**Question:**

Given a table `sales`

with columns `transaction_date`

and `amount`

, write an SQL query to calculate the 3-day moving average of sales.

**Answer:**

```
SELECT transaction_date,
amount,
AVG(amount) OVER (ORDER BY transaction_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as moving_avg
FROM sales;
```

### 36. Scenario: Identifying Duplicates

**Question:**

You have a table `employees`

with columns `employee_id`

and `name`

. Write an SQL query to find all duplicate names.

**Answer:**

```
SELECT name
FROM employees
GROUP BY name
HAVING COUNT(*) > 1;
```

### 37. Scenario: Finding Missing Numbers

**Question:**

Suppose you have a table `numbers`

with a column `num`

. Write an SQL query to find missing numbers in a range from 1 to 100.

**Answer:**

```
SELECT num+1 as missing_number
FROM numbers a
WHERE NOT EXISTS (
SELECT 1
FROM numbers b
WHERE b.num = a.num+1
)
AND num < 100;
```

### 38. Scenario: Calculating Age

**Question:**

Given a table `users`

with columns `user_id`

and `birth_date`

, write an SQL query to calculate the age of each user.

**Answer:**

```
SELECT user_id,
birth_date,
YEAR(CURRENT_DATE) - YEAR(birth_date) -
(CASE WHEN MONTH(CURRENT_DATE) < MONTH(birth_date)
OR (MONTH(CURRENT_DATE) = MONTH(birth_date) AND DAY(CURRENT_DATE) < DAY(birth_date)) THEN 1 ELSE 0 END) as age
FROM users;
```

### 39. Scenario: Finding Nth Highest Salary

**Question:**

You have a table `employees`

with columns `employee_id`

and `salary`

. Write an SQL query to find the second highest salary.

**Answer:**

```
SELECT MAX(salary) as second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
```

### 40. Scenario: Generating Fibonacci Series

**Question:**

Write an SQL query to generate the first 10 numbers of the Fibonacci series.

**Answer:**

```
WITH RECURSIVE fibonacci(n, a, b) AS (
SELECT 1, 0, 1
UNION ALL
SELECT n+1, b, a+b FROM fibonacci WHERE n < 10
)
SELECT a as fibonacci_number FROM fibonacci;
```

### 41. Scenario: Finding Duplicate Records

**Question:**

In a table `orders`

, there is a column `order_id`

. Write an SQL query to find all orders that have duplicate `order_id`

entries.

**Answer:**

```
SELECT order_id
FROM orders
GROUP BY order_id
HAVING COUNT(*) > 1;
```

### 42. Scenario: Updating Records

**Question:**

Suppose you have a table `products`

with columns `product_id`

and `price`

. Write an SQL query to increase the price of all products by 10%.

**Answer:**

```
UPDATE products
SET price = price * 1.1;
```

### 43. Scenario: Finding Active Users

**Question:**

Given a table `users`

with columns `user_id`

and `last_login`

, write an SQL query to find users who have logged in within the last 30 days.

**Answer:**

```
SELECT user_id, last_login
FROM users
WHERE last_login >= DATE_SUB(CURRENT_DATE, INTERVAL 30 DAY);
```

### 44. Scenario: Calculating Total Sales

**Question:**

You have a table `sales`

with columns `transaction_id`

and `amount`

. Write an SQL query to calculate the total sales amount.

**Answer:**

```
SELECT SUM(amount) as total_sales
FROM sales;
```

### 45. Scenario: Finding the Longest String

**Question:**

Suppose you have a table `strings`

with a column `text`

. Write an SQL query to find the longest string.

**Answer:**

```
SELECT text
FROM strings
ORDER BY LENGTH(text) DESC
LIMIT 1;
```

### 46. Scenario: Calculating Median

**Question:**

Given a table `scores`

with a column `score`

, write an SQL query to calculate the median score.

**Answer:**

```
SELECT AVG(score) as median_score
FROM (
SELECT score
FROM scores
ORDER BY score
LIMIT 2 - (SELECT COUNT(*) FROM scores) % 2
OFFSET (SELECT (COUNT(*) - 1) / 2 FROM scores)
) subquery;
```

### 47. Scenario: Finding Anagrams

**Question:**

You have a table `words`

with a column `word`

. Write an SQL query to find all pairs of anagrams.

**Answer:**

```
SELECT a.word as word1, b.word as word2
FROM words a, words b
WHERE a.word != b.word
AND LENGTH(a.word) = LENGTH(b.word)
AND SORTED_STRING(a.word) = SORTED_STRING(b.word);
```

(Note: `SORTED_STRING`

is a user-defined function that returns the input string with its characters sorted alphabetically.)

### 48. Scenario: Calculating Running Total

**Question:**

Given a table `orders`

with columns `order_id`

and `amount`

, write an SQL query to calculate the running total of the `amount`

column, ordered by `order_id`

.

**Answer:**

```
SELECT order_id, amount,
SUM(amount) OVER (ORDER BY order_id) as running_total
FROM orders;
```

### 49. Scenario: Finding Nth Highest Salary

**Question:**

In a table `employees`

, there is a column `salary`

. Write an SQL query to find the Nth highest salary.

**Answer:**

```
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT N-1, 1;
```

### 50. Scenario: Pivot Table

**Question:**

You have a table `sales`

with columns `month`

, `product`

, and `revenue`

. Write an SQL query to create a pivot table where rows represent months, columns represent products, and the values are the total revenue.

**Answer:**

```
SELECT *
FROM (
SELECT month, product, revenue
FROM sales
)
PIVOT (
SUM(revenue)
FOR product IN ('Product1' as Product1, 'Product2' as Product2, 'Product3' as Product3)
) as PivotTable;
```

### 51. Scenario: Finding Missing Numbers

**Question:**

Given a table `numbers`

with a column `num`

, which contains integers from 1 to N (where some numbers are missing), write an SQL query to find the missing numbers.

**Answer:**

```
WITH RECURSIVE NumbersCTE AS (
SELECT 1 as num
UNION ALL
SELECT num + 1
FROM NumbersCTE
WHERE num < (SELECT MAX(num) FROM numbers)
)
SELECT num
FROM NumbersCTE
WHERE num NOT IN (SELECT num FROM numbers);
```

### 52. Scenario: Calculating Age

**Question:**

You have a table `users`

with a column `birth_date`

. Write an SQL query to calculate the age of each user.

**Answer:**

```
SELECT user_id,
DATEDIFF(CURRENT_DATE, birth_date)/365 as age
FROM users;
```

### 53. Scenario: Recursive Hierarchy

**Question:**

Suppose you have a table `employees`

with columns `employee_id`

and `manager_id`

representing a hierarchical structure. Write an SQL query to retrieve the entire hierarchy for a given manager.

**Answer:**

```
WITH RECURSIVE Hierarchy AS (
SELECT employee_id, manager_id
FROM employees
WHERE manager_id = <given_manager_id>
UNION ALL
SELECT e.employee_id, e.manager_id
FROM employees e
JOIN Hierarchy h ON e.manager_id = h.employee_id
)
SELECT * FROM Hierarchy;
```

### 54. Scenario: Finding Duplicate Rows

**Question:**

In a table `students`

, some students have identical names and ages. Write an SQL query to find all the duplicate rows based on the `name`

and `age`

columns.

**Answer:**

```
SELECT name, age, COUNT(*) as duplicate_count
FROM students
GROUP BY name, age
HAVING COUNT(*) > 1;
```

### 55. Scenario: Concatenating Strings

**Question:**

You have a table `employees`

with columns `first_name`

and `last_name`

. Write an SQL query to create a new column `full_name`

that concatenates the first and last names with a space in between.

**Answer:**

```
SELECT first_name, last_name, CONCAT(first_name, ' ', last_name) as full_name
FROM employees;
```

### 56. Scenario: Finding Average Value

**Question:**

Given a table `scores`

with a column `score`

, write an SQL query to find the average score.

**Answer:**

```
SELECT AVG(score) as average_score
FROM scores;
```

### 57. Scenario: Calculating Percentile

**Question:**

You have a table `test_scores`

with a column `score`

. Write an SQL query to calculate the 90th percentile score.

**Answer:**

```
SELECT PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY score) as percentile_90
FROM test_scores;
```

### 58. Scenario: Finding Top N Records

**Question:**

In a table `products`

, there is a column `price`

. Write an SQL query to retrieve the top 10 most expensive products.

**Answer:**

```
SELECT product_name, price
FROM products
ORDER BY price DESC
LIMIT 10;
```

### 59. Scenario: Grouping by Time Intervals

**Question:**

You have a table `sales`

with a column `sale_date`

and `amount`

. Write an SQL query to group sales by month and calculate the total sales for each month.

**Answer:**

```
SELECT DATE_FORMAT(sale_date, '%Y-%m') as month, SUM(amount) as total_sales
FROM sales
GROUP BY month;
```

### 60. Scenario: Finding Overlapping Dates

**Question:**

Given a table `appointments`

with columns `start_date`

and `end_date`

, write an SQL query to find all appointments that overlap with a specific date range.

**Answer:**

```
SELECT *
FROM appointments
WHERE start_date <= <end_date> AND end_date >= <start_date>;
```

### 61. Scenario: Calculating Daily Growth

**Question:**

In a table `user_activity`

, there is a column `users`

representing the number of daily active users. Write an SQL query to calculate the daily growth in the number of users.

**Answer:**

```
SELECT date, users,
users - LAG(users) OVER (ORDER BY date) as daily_growth
FROM user_activity;
```

### 62. Scenario: Finding Closest Points

**Question:**

Given a table `locations`

with columns `latitude`

and `longitude`

, write an SQL query to find the closest locations to a specific point with given latitude and longitude.

**Answer:**

```
SELECT location_name, latitude, longitude,
SQRT(POW(latitude - <given_latitude>, 2) + POW(longitude - <given_longitude>, 2)) as distance
FROM locations
ORDER BY distance
LIMIT 10;
```

### 63. Scenario: Calculating Running Total

**Question:**

In a table `orders`

, there is a column `order_date`

and `order_amount`

. Write an SQL query to calculate the running total of order amounts.

**Answer:**

```
SELECT order_date, order_amount,
SUM(order_amount) OVER (ORDER BY order_date) as running_total
FROM orders;
```

### 64. Scenario: Handling NULL Values

**Question:**

You have a table `employees`

with columns `employee_id`

, `first_name`

, and `last_name`

. Write an SQL query to display the full names of employees. If the last name is NULL, it should display “Unknown” instead.

**Answer:**

```
SELECT employee_id,
CONCAT(first_name, ' ', COALESCE(last_name, 'Unknown')) as full_name
FROM employees;
```

### 65. Scenario: Finding Nth Highest Value

**Question:**

Given a table `scores`

with a column `score`

, write an SQL query to find the 3rd highest score.

**Answer:**

```
SELECT DISTINCT score
FROM scores
ORDER BY score DESC
LIMIT 2, 1;
```

### 66. Scenario: Calculating Age

**Question:**

In a table `users`

, there is a column `birth_date`

. Write an SQL query to calculate the age of each user.

**Answer:**

```
SELECT user_name, birth_date,
YEAR(CURRENT_DATE) - YEAR(birth_date) -
(DATE_FORMAT(CURRENT_DATE, '%m%d') < DATE_FORMAT(birth_date, '%m%d')) as age
FROM users;
```

### 67. Scenario: Calculating Median Value

**Question:**

You have a table `test_scores`

with a column `score`

. Write an SQL query to calculate the median score.

**Answer:**

```
SELECT
AVG(score) as median_score
FROM
(SELECT score FROM test_scores ORDER BY score LIMIT 2 - (SELECT COUNT(*) FROM test_scores) % 2, (SELECT COUNT(*)+1 FROM test_scores) / 2) as tbl;
```

### 68. Scenario: Retrieving Random Rows

**Question:**

Write an SQL query to retrieve a random row from a table `products`

.

**Answer:**

```
SELECT *
FROM products
ORDER BY RAND()
LIMIT 1;
```

### 69. Scenario: Calculating Cumulative Percentage

**Question:**

Given a table `sales`

with columns `product_name`

and `sales_amount`

, write an SQL query to calculate the cumulative percentage of total sales for each product.

**Answer:**

```
SELECT product_name, sales_amount,
(SUM(sales_amount) OVER (ORDER BY sales_amount DESC) /
(SELECT SUM(sales_amount) FROM sales)) * 100 as cumulative_percentage
FROM sales;
```

### 70. Scenario: Finding Missing Numbers

**Question:**

In a table `numbers`

, some numbers are missing. Write an SQL query to find the missing numbers between a specific range.

**Answer:**

```
WITH RECURSIVE missing_numbers AS (
SELECT <start_number> as number
UNION
SELECT number + 1 FROM missing_numbers WHERE number < <end_number>
)
SELECT number
FROM missing_numbers
WHERE number NOT IN (SELECT number FROM numbers);
```

### 71. Scenario: Finding Duplicate Records

**Question:**

Given a table `students`

with columns `student_id`

and `name`

, write an SQL query to find duplicate records based on the `name`

.

**Answer:**

```
SELECT name, COUNT(*) as count
FROM students
GROUP BY name
HAVING count > 1;
```

### 72. Scenario: Calculating Moving Average

**Question:**

You have a table `temperature_readings`

with columns `reading_date`

and `temperature`

. Write an SQL query to calculate the 7-day moving average of temperatures.

**Answer:**

```
SELECT reading_date, temperature,
AVG(temperature) OVER (ORDER BY reading_date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_average
FROM temperature_readings;
```

### 73. Scenario: Finding Longest Consecutive Sequence

**Question:**

In a table `numbers`

, there is a column `num`

. Write an SQL query to find the longest consecutive sequence of numbers.

**Answer:**

```
SELECT MIN(num) as start_num, MAX(num) as end_num
FROM (
SELECT num,
num - ROW_NUMBER() OVER (ORDER BY num) as grp
FROM numbers
) as t
GROUP BY grp
ORDER BY COUNT(*) DESC
LIMIT 1;
```

### 74. Scenario: Generating Rank within Groups

**Question:**

Given a table `sales`

with columns `product_name`

and `sales_amount`

, write an SQL query to calculate the rank of products based on sales within each category.

**Answer:**

```
SELECT product_name, sales_amount,
DENSE_RANK() OVER (PARTITION BY product_category ORDER BY sales_amount DESC) as rank_within_category
FROM sales;
```

### 75. Scenario: Finding Latest Record in Each Group

**Question:**

You have a table `customer_orders`

with columns `order_id`

, `customer_id`

, and `order_date`

. Write an SQL query to find the latest order for each customer.

**Answer:**

```
SELECT customer_id, MAX(order_date) as latest_order_date
FROM customer_orders
GROUP BY customer_id;
```

### 76. Scenario: Calculating Time Difference

**Question:**

In a table `log_events`

, there are columns `event_type`

and `event_timestamp`

. Write an SQL query to calculate the time difference between consecutive events of the same type.

**Answer:**

```
SELECT event_type, event_timestamp,
TIMEDIFF(event_timestamp,
LAG(event_timestamp) OVER (PARTITION BY event_type ORDER BY event_timestamp)) as time_diff
FROM log_events;
```

### 77. Scenario: Finding Common Elements

**Question:**

Given two tables `table1`

and `table2`

with columns `element`

, write an SQL query to find the common elements between the two tables.

**Answer:**

```
SELECT element
FROM table1
INTERSECT
SELECT element
FROM table2;
```

### 78. Scenario: Calculating Exponential Moving Average (EMA)

**Question:**

You have a table `stock_prices`

with columns `price_date`

and `stock_price`

. Write an SQL query to calculate the 10-day exponential moving average of stock prices.

**Answer:**

```
SELECT price_date, stock_price,
AVG(stock_price) OVER (ORDER BY price_date
ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) as ema_10_day
FROM stock_prices;
```

### 79. Scenario: Finding Nth Highest Salary

**Question:**

Write an SQL query to find the nth highest salary from a table `employees`

.

**Answer:**

```
SELECT DISTINCT salary
FROM employees
ORDER BY salary DESC
LIMIT (n-1), 1;
```

### 80. Scenario: Calculating Total Sales by Month

**Question:**

Given a table `sales`

with columns `sale_date`

and `amount`

, write an SQL query to calculate the total sales amount for each month.

**Answer:**

```
SELECT DATE_FORMAT(sale_date, '%Y-%m') as month, SUM(amount) as total_sales
FROM sales
GROUP BY month;
```

### 81. Scenario: Calculating Running Total

**Question:**

You have a table `orders`

with columns `order_date`

and `order_amount`

. Write an SQL query to calculate the running total of order amounts.

**Answer:**

```
SELECT order_date, order_amount,
SUM(order_amount) OVER (ORDER BY order_date) as running_total
FROM orders;
```

### 82. Scenario: Finding Missing Numbers

**Question:**

In a table `numbers`

, some numbers are missing. Write an SQL query to find the missing numbers in a given range.

**Answer:**

```
SELECT DISTINCT n1 + 1 as missing_number
FROM numbers n1, numbers n2
WHERE n1 + 1 NOT IN (SELECT num FROM numbers)
AND n1 < n2
AND n1 + 1 <= (SELECT MAX(num) FROM numbers);
```

### 83. Scenario: Finding Second Highest Salary

**Question:**

Write an SQL query to find the second highest salary from a table `employees`

.

**Answer:**

```
SELECT MAX(salary) as second_highest_salary
FROM employees
WHERE salary < (SELECT MAX(salary) FROM employees);
```

### 84. Scenario: Calculating Age

**Question:**

Given a table `users`

with a column `date_of_birth`

, write an SQL query to calculate the age of each user.

**Answer:**

```
SELECT user_id, name, date_of_birth,
YEAR(CURDATE()) - YEAR(date_of_birth) -
(DATE_FORMAT(CURDATE(), '%m%d') < DATE_FORMAT(date_of_birth, '%m%d')) as age
FROM users;
```

### 85. Scenario: Finding Top N Records

**Question:**

You have a table `products`

with columns `product_name`

and `sales`

. Write an SQL query to find the top 5 products by sales.

**Answer:**

```
SELECT product_name, sales
FROM products
ORDER BY sales DESC
LIMIT 5;
```

### 86. Scenario: Finding Duplicate Records

**Question:**

Given a table `students`

with columns `student_id`

and `name`

, write an SQL query to find duplicate records.

**Answer:**

```
SELECT student_id, name
FROM students
GROUP BY student_id, name
HAVING COUNT(*) > 1;
```

### 87. Scenario: Calculating Percentiles

**Question:**

You have a table `test_scores`

with columns `student_id`

and `score`

. Write an SQL query to calculate the 90th percentile of scores.

**Answer:**

```
SELECT PERCENTILE_DISC(0.9) WITHIN GROUP (ORDER BY score) as percentile_90
FROM test_scores;
```

### 88. Scenario: Finding Uncommon Elements

**Question:**

Given two tables `table1`

and `table2`

with a common column `id`

, write an SQL query to find elements that are present in either `table1`

or `table2`

, but not in both.

**Answer:**

```
SELECT id
FROM table1
WHERE id NOT IN (SELECT id FROM table2)
UNION ALL
SELECT id
FROM table2
WHERE id NOT IN (SELECT id FROM table1);
```

### 89. Scenario: Calculating Median

**Question:**

In a table `scores`

with column `score`

, write an SQL query to calculate the median of scores.

**Answer:**

```
SELECT AVG(score) as median
FROM (
SELECT score
FROM scores
ORDER BY score
LIMIT 2 - MOD((SELECT COUNT(*) FROM scores), 2), 1
) as subquery;
```

### 90. Scenario: Finding Cycle in a Directed Graph

**Question:**

Given a table `edges`

with columns `source_node`

and `target_node`

, write an SQL query to find if there’s a cycle in the directed graph.

**Answer:**

```
SELECT DISTINCT e1.source_node, e1.target_node
FROM edges e1, edges e2
WHERE e1.target_node = e2.source_node
AND e1.source_node = e2.target_node;
```

### 91. Scenario: Calculating Moving Average

**Question:**

You have a table `sales`

with columns `sale_date`

and `amount`

. Write an SQL query to calculate the 3-day moving average of sales.

**Answer:**

```
SELECT sale_date, amount,
AVG(amount) OVER (ORDER BY sale_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) as moving_avg
FROM sales;
```

### 92. Scenario: Finding Prime Numbers

**Question:**

Write an SQL query to find all prime numbers less than or equal to `n`

.

**Answer:**

```
WITH RECURSIVE primes AS (
SELECT 2 as num
UNION
SELECT num + 1
FROM primes
WHERE NOT EXISTS (
SELECT 1
FROM primes p
WHERE p.num > 1 AND p.num < num AND num % p.num = 0
)
AND num <= n
)
SELECT num FROM primes;
```

### 93. Scenario: Calculating Cumulative Sum

**Question:**

You have a table `orders`

with columns `order_id`

, `customer_id`

, and `order_total`

. Write an SQL query to calculate the cumulative sum of `order_total`

for each customer.

**Answer:**

```
SELECT order_id, customer_id, order_total,
SUM(order_total) OVER (PARTITION BY customer_id ORDER BY order_id) as cumulative_total
FROM orders;
```

### 94. Scenario: Finding Missing Numbers

**Question:**

Given a table `numbers`

with a column `num`

, containing numbers from 1 to `n`

, write an SQL query to find the missing numbers.

**Answer:**

```
WITH RECURSIVE missing_nums AS (
SELECT 1 as num
UNION
SELECT num + 1
FROM missing_nums
WHERE num < n
)
SELECT num
FROM missing_nums
LEFT JOIN numbers ON missing_nums.num = numbers.num
WHERE numbers.num IS NULL;
```

### 95. Scenario: Hierarchical Data

**Question:**

You have a table `employees`

with columns `employee_id`

, `name`

, and `manager_id`

(where `manager_id`

points to `employee_id`

). Write an SQL query to retrieve the hierarchical structure.

**Answer:**

```
WITH RECURSIVE org_chart AS (
SELECT employee_id, name, manager_id, 0 as level
FROM employees
WHERE manager_id IS NULL
UNION ALL
SELECT e.employee_id, e.name, e.manager_id, oc.level + 1
FROM employees e
JOIN org_chart oc ON e.manager_id = oc.employee_id
)
SELECT *
FROM org_chart;
```

### 96. Scenario: Generating Random Data

**Question:**

Write an SQL query to generate 10 random numbers between 1 and 100.

**Answer:**

```
SELECT ROUND(RANDOM() * 99 + 1)::INT as random_number
FROM generate_series(1, 10);
```

### 97. Scenario: Finding Longest Consecutive Sequence

**Question:**

Given a table `numbers`

with a column `num`

, write an SQL query to find the longest consecutive sequence of numbers.

**Answer:**

```
SELECT MIN(num) as start_range, MAX(num) as end_range
FROM (
SELECT num, ROW_NUMBER() OVER (ORDER BY num) - ROW_NUMBER() OVER (PARTITION BY num ORDER BY num) as grp
FROM numbers
) as subquery
GROUP BY grp
ORDER BY COUNT(*) DESC
LIMIT 1;
```

### 98. Scenario: Calculating Fibonacci Sequence

**Question:**

Write an SQL query to generate the first 10 numbers in the Fibonacci sequence.

**Answer:**

```
WITH RECURSIVE fibonacci AS (
SELECT 0 as n, 0 as fib
UNION ALL
SELECT 1 as n, 1 as fib
UNION ALL
SELECT n + 1, fib + (LAG(fib) OVER (ORDER BY n))
FROM fibonacci
WHERE n < 9
)
SELECT fib
FROM fibonacci;
```

### 99. Scenario: Calculating Moving Averages

**Question:**

You have a table `sales`

with columns `date`

and `amount`

. Write an SQL query to calculate the 7-day moving average of sales.

**Answer:**

```
SELECT date, amount,
AVG(amount) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) as moving_avg
FROM sales;
```

### 100. Scenario: Finding Duplicate Records

**Question:**

Given a table `users`

with columns `user_id`

and `email`

, write an SQL query to find and list all the duplicate email addresses.

**Answer:**

```
SELECT email, COUNT(*) as duplicate_count
FROM users
GROUP BY email
HAVING COUNT(*) > 1;
```