Top 70 Splunk Interview Questions and Answers in 2021

Splunk Inc. is an American technology company that is based in San Francisco, California, which produces software for monitoring, searching, and analyzing machine-generated data through a Web-style interface.

What does Splunk software do?

Splunk captures, indexes, and correlates real-time data in a searchable repository, from which it can generate graphs, alerts, reports, visualizations, and dashboards. Splunk is usually described as a horizontal technology used for application management, security and compliance, and business and web analytics.

If you are planning to attend a Splunk interview, prepare beforehand so that you can crack it easily. Go through our Splunk Interview Questions and Answers, which should be of some help to you.

Top Splunk Interview Questions and Answers

1. Can you define Splunk in simple words?

Splunk can be defined as a software platform that enables users to analyze machine-generated data. Splunk is mainly used for searching, monitoring, visualizing, and reporting enterprise data. It analyzes and processes the machine data and converts them into powerful operational intelligence by offering some real-time insights.

2. Explain the components of Splunk Architecture?

The main components in Splunk architecture are the forwarder, indexer, and search head.

Splunk Forwarder: The forwarder is an agent that you deploy on an IT system; it collects logs and sends them to the indexer. Splunk has two types of forwarders:

  • Universal Forwarder: It forwards the raw data without prior treatment. It is faster and requires fewer resources on the host, but it sends large quantities of data to the indexer.
  • Heavy Forwarder: It parses the data at the source, on the host machine (and can optionally index it locally), and sends only the parsed events to an indexer.

Splunk Indexer: The indexer transforms the data into events, stores them on disk, adds them to an index, and makes them searchable.

The indexer creates the following files and separates them into directories known as buckets:

  • Compressed raw data
  • Indexes pointing to raw data (.TSIDX files)
  • Metadata files

Splunk Search Head: It enables the UI users to interact with Splunk. It allows the users to query and search the Splunk data and interfaces with indexers to gain access to the specific data that they request.

3. Why do we use Splunk to analyze machine data?

  1. E2E visibility – With machine data, Splunk provides end-to-end visibility across operations and breaks it down across the infrastructure.
  2. Business Decisions – Splunk learns patterns and trends, gains operational intelligence from machine data, and helps in making smarter business decisions.
  3. Explore & Examine – With machine data, Splunk finds problems, correlates events across various data sources, and implicitly detects patterns across huge sets of data.
  4. Upfront Servers Monitoring – Splunk uses machine data to monitor systems, which helps in identifying problems, issues, and attacks.

4. Can you list the common port numbers used by Splunk?

The common port numbers for Splunk are listed below:

  1. Splunk Web Port: 8000
  2. Splunk Management Port: 8089
  3. Splunk Network port: 514
  4. Splunk Index Replication Port: 8080
  5. Splunk Indexing Port: 9997
  6. KV store: 8191
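
As a quick illustration of the list above, here is a minimal Python sketch that keeps the default ports in one place and checks whether a given host accepts TCP connections on them. The names `SPLUNK_DEFAULT_PORTS`, `is_port_open`, and `check_ports` are made up for this example, and real deployments frequently change these defaults.

```python
import socket

# Default ports as listed above; deployments often override them.
SPLUNK_DEFAULT_PORTS = {
    "web": 8000,
    "management": 8089,
    "network": 514,
    "index replication": 8080,
    "indexing": 9997,
    "kv store": 8191,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host):
    """Map each named default port to whether it is reachable on host."""
    return {
        name: is_port_open(host, port)
        for name, port in SPLUNK_DEFAULT_PORTS.items()
    }
```

Calling `check_ports("localhost")` on a machine running Splunk would give a rough view of which services are listening.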

5. List the advantages of Splunk?

The advantages of Splunk are:

  1. It creates analytical reports with interactive graphs, charts, and tables, and lets you share them with other users.
  2. It is scalable, and it is easy to implement.
  3. It automatically finds useful information that is enclosed in the data so that you don’t have to identify it by yourself.
  4. It also helps in saving the searches and tags that are recognized as important data so that it can make your system even smarter.

Splunk Interview Questions and Answers

6. List the disadvantages of Splunk?

The disadvantages are:

  1. It is expensive for huge data volumes.
  2. Optimizing searches for speed is hard to do in practice; it is more of an art than a science.
  3. Dashboards are useful, but they are not as effective as those in Tableau.
  4. The IT sector is trying to replace Splunk with other new open-source options, which is also a challenge faced by Splunk.

7. List some important features of Splunk?

Some of the features are:

  1. Accelerate Development & Testing
  2. It allows us to build Real-time Data Applications
  3. It can generate faster ROI.
  4. Agile statistics and reporting with Real-time architecture;
  5. They offer search, visualization, and analysis capabilities to empower users of different types.

8. List the different Splunk products?

Splunk is usually available in three different versions:

  1. Splunk Enterprise: This edition is mainly used by large IT businesses. It helps us to gather and analyze data from websites, applications, etc.
  2. Splunk Cloud: It is a hosted platform. It has the same features as the Enterprise version. It is available from Splunk or through the AWS cloud platform.
  3. Splunk Light: It is a lightweight version that lets you search, report, and alert on log data. It has limited features and functionality compared to the other versions.

9. List the different types of search modes supported in Splunk?

Splunk supports three main search modes:

  1. Fast mode
  2. Smart mode
  3. Verbose mode

10. Can you explain how Splunk works?

Splunk agents, also known as forwarders, are installed on application servers, where they collect data from the source and forward it to the Indexer.

Next, the Indexer stores this data locally, on a host machine or in the cloud, up to the capacity allowed by the license.

Once this is set up, the Search Head is used for searching, analyzing, visualizing, and performing other functions on the data stored in the Indexer.

11. Do we have any benefits of feeding data into a Splunk instance through Splunk Forwarders?

Feeding data into a Splunk instance through Splunk Forwarders has three significant benefits:

  • TCP connection
  • Bandwidth throttling
  • An encrypted SSL connection for transferring data from a Forwarder to the Indexer

Splunk’s architecture is designed so that the data forwarded to the Indexers is load-balanced by default.

So, even if one of the Indexers goes down for some reason, the data is quickly re-routed through another Indexer instance. Further, Splunk Forwarders cache the events locally before forwarding them, which creates a temporary backup of the data.
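
The load-balancing and local-caching behavior described above can be pictured with a small Python sketch. This is only a toy model under simple assumptions (round-robin rotation, an in-memory buffer); the class `ForwarderSketch` and its attributes are hypothetical and do not reflect how a real forwarder is implemented.

```python
from collections import deque
from itertools import cycle


class ForwarderSketch:
    """Toy model: rotate across indexers, cache events while none is reachable."""

    def __init__(self, indexers):
        self.indexers = list(indexers)
        self._rotation = cycle(self.indexers)
        self.down = set()        # indexers currently unreachable
        self.buffer = deque()    # events cached locally until delivered
        self.delivered = []      # (indexer, event) pairs, for inspection

    def send(self, event):
        """Queue an event and try to deliver everything that is pending."""
        self.buffer.append(event)
        self.flush()

    def flush(self):
        while self.buffer:
            target = self._next_available()
            if target is None:
                return  # nothing reachable; events stay in the local cache
            self.delivered.append((target, self.buffer.popleft()))

    def _next_available(self):
        # Try each indexer at most once per attempt, skipping the ones that are down.
        for _ in range(len(self.indexers)):
            candidate = next(self._rotation)
            if candidate not in self.down:
                return candidate
        return None
```

With two indexers and one marked as down, every event goes to the remaining one; when both are down, events wait in `buffer` until `flush()` is called after an indexer recovers.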

12. Why do we use the License Master in Splunk?

The license master in Splunk controls one or multiple license slaves. From the license master, we can define pools and stacks, add licensing capacity, and manage the license slaves. It also ensures that the right amount of data gets indexed.

13. Differentiate between Splunk and Tableau?

Splunk:

  • It works with machine data obtained from data centers, ATMs, mobile devices, and security devices.
  • It is web-based only.
  • Its customers include Bosch, John Lewis, Amaya, Baylor University, and NPR.

Tableau:

  • It helps customers make decisions based on past data.
  • It supports web-based access, Android apps, and iPhone apps.
  • Its customers include Deloitte, Pandora, and Citrix.

14. Define Summary Index in Splunk?

Summary indexing lets us run fast searches over huge data sets by spreading the cost of computationally expensive reports over time. To achieve this, the search that populates the summary index has to run on a frequent, recurring basis, and it should extract only the specific data that is required.

15. What happens if the License Master is not reachable?

If the license master is not reachable, the data coming into the Indexer is not affected, and the data continues to flow into the Splunk deployment. However, if a license slave cannot reach the license master for 72 hours, search is blocked on that slave.

The Indexers continue to index the data as usual, but a warning message appears in the web UI. If the indexing volume has been exceeded, you need to reduce the amount of incoming data or buy a higher-capacity license.

16. Define Splunk alert?

Splunk alerts are defined as the actions that get triggered when a given criterion is met that is defined by the user. The main aim of alerts is to log an action, send an email, or output a result to a lookup file, etc.

17. Why do we need Splunk DB Connect?

Splunk DB Connect allows us to import tables, columns, and rows from the database directly into the Splunk Enterprise, which will index the data. Then we can analyze and visualize that relational data from within the Splunk Enterprise just as you would do with the rest of the Splunk Enterprise data.

18. Define license violation from Splunk’s perspective?

In Splunk, the license violation warning states that Splunk has indexed more data than the purchased license quota. We need to identify which index or source type has received more data recently than the daily data volume.

19. Define SF and RF?

The Search Factor (SF) specifies the number of searchable copies of data that the indexer cluster maintains. In other words, the SF determines the number of searchable copies of each bucket. The default value for the SF is 2, which means that the cluster maintains two searchable copies of all data.

The Replication Factor (RF) specifies the total number of copies of the data that the cluster maintains, and therefore how many peers receive a copy. The default value for the RF is 3, and the SF cannot be greater than the RF. A peer node indexes external data and simultaneously stores copies of replicated data that are sent to it from other peers.
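
The relationship between the two factors can be written down as a small sanity check. The Python sketch below is only an illustration: the function name `validate_cluster_factors` and its messages are invented here, and it encodes just two constraints (the SF cannot exceed the RF, and the cluster needs at least as many peers as the RF).

```python
def validate_cluster_factors(replication_factor=3, search_factor=2, peer_count=3):
    """Return a list of problems with the given factors (empty if none)."""
    problems = []
    if replication_factor < 1 or search_factor < 1:
        problems.append("both factors must be at least 1")
    if search_factor > replication_factor:
        problems.append("search factor cannot exceed replication factor")
    if replication_factor > peer_count:
        problems.append("not enough peers to hold all replicated copies")
    return problems
```

For example, `validate_cluster_factors(3, 2, 3)` reports no problems, while an SF of 3 with an RF of 2 is flagged.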

20. List a few Splunk search commands?

  1. abstract
  2. erex
  3. addtotals
  4. accum
  5. filldown
  6. typer
  7. rename
  8. anomalies

21. List a few use cases of Knowledge objects?

Knowledge objects are used in domains like:

Network Security: We can increase the security in systems by blacklisting some IPs from getting into the network. This is done by using a Knowledge object called lookups.

Application Monitoring: By using the knowledge objects, one can monitor the applications in real-time and can configure alerts which will then notify us when the application crashes.

Physical Security: If the organization deals with physical security, we can leverage data that contains information about flooding, earthquakes, volcanoes, etc. to gain valuable insights.

Employee Management: If we have to monitor the activities of people who are serving their notice period, we can create a list of those people and a rule that prevents them from copying information and using it outside.

22. List the basic commands that are included in the filtering results category in Splunk?

Here is a list of a few commands that are used to filter results:

  1. Where – The where command uses an eval expression to filter the search results. We use it to dive deeper into the results of a search.
  2. Sort – If we want the results sorted by certain fields, we use the sort command, which can sort in ascending or descending order. The number of results to return can also be limited with this command.
  3. Search – We use this command to retrieve events from the indexes. Events can be searched using keywords, key/value pairs, quoted phrases, and wildcards.
  4. Rex – The rex command uses a regular expression to extract specific fields from the events.

23. Can you tell us the different options while setting up Alerts?

The different options that are available while setting up alerts are given below:

  1. We can create a webhook so that the alert writes to a chat tool or GitHub. We can also send an email to a group with the subject, the body of the message, and a priority.
  2. We can attach the results as a PDF or CSV, or include them inline in the body of the message, so that the recipient understands where the alert fired, under what conditions, and what action was taken.
  3. We can also create tickets and throttle alerts based on specific conditions such as machine name or an IP address. For example, if there is any virus outbreak, we don’t want each and every alert to be triggered as it will lead to many tickets that are created in the system, which results in overload. We can control these types of alerts from the alert window.

24. Write the general expression for extracting IP addresses from logs?

One can extract the IP address from logs in multiple ways, but the regular expression would be:

rex field=_raw "(?<ip_address>([0-9]{1,3}[\.]){3}[0-9]{1,3})"

or

rex field=_raw "(?<ip_address>\d+\.\d+\.\d+\.\d+)"
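
To try the same idea outside Splunk, here is a minimal Python sketch of the first pattern. Note that Python's `re` module spells named groups as `(?P<name>...)`. The helper `extract_ip_addresses` and the range check on each octet are additions for this example, not part of the rex expressions above.

```python
import re

# Same shape as the rex pattern: four groups of 1-3 digits separated by dots.
IP_PATTERN = re.compile(r"(?P<ip_address>(?:[0-9]{1,3}\.){3}[0-9]{1,3})")


def extract_ip_addresses(line):
    """Return dotted-quad matches in line whose octets are all in 0-255."""
    candidates = [m.group("ip_address") for m in IP_PATTERN.finditer(line)]
    return [
        ip for ip in candidates
        if all(0 <= int(part) <= 255 for part in ip.split("."))
    ]
```

The extra octet check matters because the regular expression alone would also accept strings such as 999.1.1.1.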

25. Can you explain Workflow Actions?

Once we have assigned rules and generated reports and schedules, we can create workflow actions that automate certain tasks. For example:

  1. A double-click can drill down into a specific list that contains user names and IP addresses, and we can search further within that list.
  2. A double-click can retrieve the user name from a report and pass it as a parameter to the next report.
  3. Workflow actions can retrieve data and also send data to other fields. For example, we can pass latitude and longitude details to Google Maps to find where an IP address or location is.

26. Differentiate between the Stats, Chart, and Timechart commands?

Stats: It is a reporting command that calculates statistics and presents them as a table; it can group by multiple fields.

Chart: It returns the search results in a form that can be displayed as a bar chart, line chart, or area graph.

Timechart: It usually takes only one field because the x-axis is always the time field.
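
The difference between grouping by a field and grouping by time can be sketched in a few lines of Python. This is only an analogy for a count by field and a count per time bucket; the helper names `stats_count_by` and `timechart_count` and the sample events are made up for illustration.

```python
from collections import Counter

# Hypothetical events: _time is seconds since some starting point.
EVENTS = [
    {"_time": 0, "host": "web1"},
    {"_time": 1800, "host": "web2"},
    {"_time": 3600, "host": "web1"},
]


def stats_count_by(events, field):
    """Count events per value of field, like a tabular count grouped by that field."""
    return dict(Counter(event[field] for event in events))


def timechart_count(events, span_seconds=3600):
    """Count events per time bucket; the bucket start plays the role of the x-axis."""
    buckets = Counter(
        (event["_time"] // span_seconds) * span_seconds for event in events
    )
    return dict(sorted(buckets.items()))
```

Here the first helper groups by any field you name, while the second always groups by time, which is why a time-based chart has less freedom in its x-axis.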

27. How to troubleshoot Splunk performance issues?

To troubleshoot Splunk performance issues, follow the given steps:

  1. You need to use the Splunk Secure Gateway for troubleshooting dashboards.
  2. You need to check for server performance issues (CPU/memory usage, disk i/o, etc.)
  3. Check in the splunkd.log to find any errors.
  4. Check for the number of saved searches that are running currently and also check their system resource consumption.
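  5. Install Firebug, a Firefox extension, and enable it. After logging in to Splunk with Firefox, its Net panel shows the HTTP requests and responses along with the time spent on each, which helps to identify slow requests.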

28. Define the terms Data Models and Pivot?

Data models are mainly used to create a structured hierarchical model of the data. It is used when you have a huge amount of unstructured data, and you want to make use of that data without making use of complex search queries.

A few uses of Data models are:

  1. Creating Sales Reports
  2. To set the access Levels
  3. To enable Authentication

With Pivots, we have the flexibility to create front views of the results and then pick and select the most appropriate filter for a good view of results. 

29. Define the terms Lookup command and Inputlookup command?

We use the lookup command to retrieve fields from an external source, such as a CSV file or a Python-based script, and add them to an event. It enriches the search results by referencing fields in the external CSV file that match fields in the event data.

The inputlookup command, as the name suggests, takes a lookup table as input. For example, it reads a table of product names and product prices into the search, where the rows can be matched against an internal field such as a product ID or an item ID.

The outputlookup command writes the current search results out to a lookup table. In simple terms, inputlookup reads the lookup data in, and outputlookup builds the lookup data.
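
The enrichment idea behind lookups can be shown with a short Python sketch that joins events against a CSV table. Everything here is hypothetical (the CSV contents, the field names, and the helpers `load_lookup` and `enrich`); it illustrates the concept rather than Splunk's implementation.

```python
import csv
import io

# Hypothetical lookup table, standing in for an external CSV file.
PRODUCTS_CSV = (
    "product_id,product_name,price\n"
    "P100,Widget,9.99\n"
    "P200,Gadget,24.50\n"
)


def load_lookup(csv_text, key_field):
    """Read CSV text into a dict keyed by key_field."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: row for row in reader}


def enrich(events, table, key_field):
    """Add the matching lookup row's fields to each event; event fields win on conflict."""
    enriched = []
    for event in events:
        match = table.get(event.get(key_field), {})
        enriched.append({**match, **event})
    return enriched
```

An event carrying only `product_id` comes back with `product_name` and `price` attached, while events without a match pass through unchanged.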

30. Define Buckets and explain the Splunk Bucket Lifecycle?

Buckets can be defined as directories that are used to store the indexed data in Splunk. 

A bucket undergoes several phases of transformation over time. They are:

  1. Hot – This bucket consists of the newly indexed data, and it is open for writing and new additions. An index can have one or multiple hot buckets.
  2. Warm – This bucket consists of the data that has rolled from a hot bucket.
  3. Cold – This bucket consists of the data that has rolled from a warm bucket.
  4. Frozen – This bucket consists of the data that has rolled from a cold bucket. The Splunk Indexer deletes frozen data by default, but we have the option to archive it. One important point is that frozen data is not searchable.
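
The lifecycle above is essentially a one-way state machine, which the following Python sketch makes explicit. The names `BucketStage`, `roll`, and `is_searchable` are invented for this illustration; the conditions that actually trigger a roll (size, age, and so on) are left out.

```python
from enum import Enum


class BucketStage(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"


# Each stage rolls to exactly one next stage; frozen is the end of the line.
_NEXT_STAGE = {
    BucketStage.HOT: BucketStage.WARM,
    BucketStage.WARM: BucketStage.COLD,
    BucketStage.COLD: BucketStage.FROZEN,
}


def roll(stage):
    """Return the stage that follows the given one."""
    if stage not in _NEXT_STAGE:
        raise ValueError(f"{stage.value} buckets do not roll any further")
    return _NEXT_STAGE[stage]


def is_searchable(stage):
    """Frozen data is not searchable; every earlier stage is."""
    return stage is not BucketStage.FROZEN
```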

31. Can you list the default fields for events in Splunk?

  1. host
  2. source
  3. sourcetype
  4. index
  5. timestamp (_time)

32. Why do we need the Time Zone property in Splunk?

The time zone in Splunk is crucial when searching for events from a security or fraud perspective. Splunk takes the default time zone from the browser settings, and the browser in turn picks up the current time zone from the machine that we are using.

So, when we search for an event with the wrong time zone, we do not find anything relevant for that specific search. This is extremely important when we are searching and correlating data that pours in from multiple sources.

33. Explain the concept of file precedence in Splunk?

In Splunk, the File precedence is one of the important aspects of troubleshooting for an administrator, developer, and architect. All the Splunk configurations are usually written within the plain text .conf files.

There can be various copies present for each of the files, and it is important for us to know the type of role these files play when a Splunk instance is restarted or running. 

34. How can we extract fields in Splunk?

  1. One can extract the fields from event lists, sidebar, or from the settings menu through the UI.
  2. Another way is to write the regular expressions of your own in props.conf configuration file.

35. Explain the Source type in Splunk?

The Source type in Splunk will determine how the Splunk software processes incoming data streams into the individual events according to data nature.

36. Differentiate between Search time and Index time field extractions?

Search Time: It takes place when you search through the data. Splunk creates the fields while compiling the search results, and it does not store them in the index.

Index Time: It is the time period from when Splunk receives new data to when that data is written to a Splunk index.

37. List the pros and cons of Summary Indexing?

Pros:

  1. The summary index will retain the analytics and reports even after the data has aged out.

Cons:

  1. Needle-in-a-haystack searches are not possible.
  2. Deep dive analysis is not possible.

38. Write the command to stop and start Splunk service?

The command to start Splunk service: ./splunk start

The command to stop Splunk service: ./splunk stop

39. Can you differentiate between Splunk App and Add-on?

Splunk App: Apps are mainly used for visualization, analysis, and representation.

Add-on: Add-ons are mainly used for data optimization and the collection process in order to increase efficiency.

40. Define data ages in Splunk?

The data that comes into the indexer is stored in directories known as buckets. A bucket moves through several stages as the data ages (hot, warm, cold, frozen, and thawed). Over time, buckets ‘roll’ from one stage to the next.

41. How do we assign color nodes in a chart that are based on field names in Splunk?

Colors should be assigned to charts when the reports are created. If no colors are assigned, default colors are used.

Here are the steps to assign the colors:

  1. Edit the panels that are built on top of the dashboard.
  2. Modify the panel settings from the UI.
  3. Select the colors.

Alternatively:

  1. Write commands that select colors from the palette, either by entering hexadecimal values or by writing code.
  2. Provide various gradients and set values for a radial gauge or water gauge.

42. How can one clear the search history?

To clear the search history, delete the following file from the Splunk server:

$SPLUNK_HOME/var/log/splunk/searches.log

43. In Splunk, how can we exclude some events from being indexed?

Sometimes, you do not want to index all the events in the Splunk instance. In such a case, how can we keep those events out of Splunk?

Consider the example of debug messages in the application development cycle. We can exclude these debug messages by routing those events to the null queue. The null queue routing is defined in transforms.conf (and referenced from props.conf) at the heavy forwarder or indexer level, where the data is parsed.
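
In Splunk itself, this routing is typically a transforms.conf stanza that sets REGEX, DEST_KEY = queue, and FORMAT = nullQueue, referenced from props.conf through a TRANSFORMS-<class> setting. The Python sketch below only simulates the effect of such a rule; the pattern, the queue label `indexQueue`, and the helper names are assumptions made for this example.

```python
import re

# Assumed pattern for this example: any event containing the word DEBUG is discarded.
DEBUG_PATTERN = re.compile(r"\bDEBUG\b")


def route(event):
    """Return the queue an event would go to under the rule above."""
    return "nullQueue" if DEBUG_PATTERN.search(event) else "indexQueue"


def filter_events(events):
    """Keep only the events that would continue on to be indexed."""
    return [event for event in events if route(event) == "indexQueue"]
```

Events sent to the null queue are dropped before indexing, so they do not count against the license volume.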

44. Can you define Btool in Splunk?

Splunk software configuration files, also known as conf files, will be loaded and merged to make the working set of configurations used by Splunk software while performing certain tasks. The btool command will simulate the merging process by using the on-disk conf files, and it will create a report showing the merged settings.

45. Define Splunk App? 

A Splunk app can be defined as an extension of the Splunk functionality that has its own in-built UI context that serves a specific requirement. Usually, the Splunk apps are made up of various Splunk knowledge objects such as lookups, event types, tags. Apps can utilize or leverage other apps or add-ons.

46. Define a Fishbucket?

In Splunk, the Fishbucket is a sub-directory used internally to track how far the content of each monitored file has been indexed. It stores two things to achieve this: seek pointers and CRCs (cyclic redundancy checks).
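
The combination of a CRC and a seek pointer can be illustrated with a toy Python class. This is a sketch under simplifying assumptions, not Splunk's implementation: the class `FishbucketSketch` is hypothetical, files are passed in as bytes, and the 256-byte default for the CRC window is an assumption of this example.

```python
import zlib


class FishbucketSketch:
    """Toy tracker: a CRC of the file's first bytes identifies the file,
    and a seek pointer records how much of it has already been read."""

    def __init__(self, crc_length=256):
        self.crc_length = crc_length
        self._seek = {}  # CRC of the file head -> offset already processed

    def read_new(self, content):
        """Return only the bytes of content that have not been seen before."""
        key = zlib.crc32(content[: self.crc_length])
        offset = self._seek.get(key, 0)
        new_data = content[offset:]
        self._seek[key] = len(content)
        return new_data
```

One limitation of this sketch: a file shorter than `crc_length` gets a different CRC once it grows, so it would be read again from the start.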

47. Define Dispatch directory?

The dispatch directory in Splunk stores artifacts on the nodes where searches are run. These nodes will include search peers, search heads, and standalone Splunk Enterprise instances. 

48. How to add local logs to a Splunk forwarder?

You can add Windows system/application/security/IIS logs and scripted inputs by using the method below:

  1. First, log in to Splunk.
  2. Then, go to Settings and click on Data inputs under the Data title.
  3. To add local events, click on Add new/Edit next to Local event log collection, provided the logs are available on the local machine.
  4. Now, select the logs that need to be monitored, select the index in which you want to store the logs, and click on Save. Once saved, you can search the local logs through the Splunk GUI for any errors.

49. Define the configuration files precedence in Splunk?

The precedence of configuration files in Splunk:

  1. System Local Directory (highest priority)
  2. App Local Directories
  3. App Default Directories
  4. System Default Directory (lowest priority)
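
The effect of this ordering can be sketched in Python as a layered merge in which higher-priority directories override lower ones. This is a simplification (it merges a flat set of keys and ignores stanzas), and the layer labels and the helper `merge_settings` are names invented for this example.

```python
# Listed from lowest to highest priority, so later layers overwrite earlier ones.
PRECEDENCE = ["system/default", "app/default", "app/local", "system/local"]


def merge_settings(layers):
    """Merge per-directory settings into one working set, highest priority last."""
    merged = {}
    for layer in PRECEDENCE:
        merged.update(layers.get(layer, {}))
    return merged
```

For example, if `system/default` sets a key to one value and `app/local` sets it to another, the merged result carries the `app/local` value unless `system/local` also defines it.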

50. Name the latest Splunk version in use?

Splunk 8.1.3 (Check the latest version)

51. List the types of Splunk Licenses?

  1. Enterprise license
  2. Free license
  3. Forwarder license
  4. Beta license
  5. Licenses for search heads (for distributed search)
  6. Licenses for cluster members (for index replication)

52. Where is the Splunk Default Configuration stored?

$SPLUNK_HOME/etc/system/default

53. Can you name the top direct competitors to Splunk?

A few of the top direct competitors to Splunk:

  1. Logstash
  2. Loggly
  3. LogLogic
  4. Sumo Logic, etc. 

54. From a licensing perspective, how does Splunk determine one day?

In terms of Splunk licensing, one day is measured as from midnight to midnight on the clock of the license master.
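
Because the licensing day runs from midnight to midnight, daily usage is simply a sum per calendar date. The Python sketch below shows that bookkeeping; the function names and the quota figure used in the example are hypothetical, and the timestamps are assumed to already be expressed in the license master's clock.

```python
from collections import defaultdict
from datetime import datetime


def daily_indexed_bytes(records):
    """Sum indexed bytes per calendar day from (timestamp, size_bytes) pairs."""
    totals = defaultdict(int)
    for timestamp, size_bytes in records:
        totals[timestamp.date()] += size_bytes
    return dict(totals)


def days_over_quota(records, quota_bytes):
    """Return the days, in order, on which the indexed volume exceeded the quota."""
    totals = daily_indexed_bytes(records)
    return sorted(day for day, total in totals.items() if total > quota_bytes)
```

Data indexed at 23:59 and at 00:00 lands in two different licensing days, even though the events are only a minute apart.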

55. List the features that are not available in Splunk Free?

  1. Distributed search
  2. Authentication and scheduled searches/alerting
  3. Forwarding in TCP/HTTP (to non-Splunk)
  4. Deployment management

56. Define Transaction commands?

We use the transaction command in two specific cases:

  1. When a unique ID (from one or multiple fields) alone is not sufficient to differentiate between two transactions. This is the case when the identifier is reused, for example, in web sessions that are identified by a cookie or client IP. In such a case, time spans or pauses are used to segment the data into transactions. In other cases where an identifier is reused, such as in DHCP logs, a specific message identifies the beginning or end of a transaction.
  2. When we want to see the raw text of the combined events rather than an analysis of the constituent fields of the events.
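
The first case, splitting a reused identifier into separate transactions by pauses, can be sketched in Python as follows. The helper `group_transactions` and the sample field names are invented for this example; the `max_pause` parameter is loosely modeled on the pause-based segmentation described above.

```python
def group_transactions(events, id_field, max_pause):
    """Group events sharing id_field into transactions, starting a new one
    whenever the gap since the previous event for that ID exceeds max_pause."""
    transactions = []
    open_by_id = {}
    for event in sorted(events, key=lambda e: e["_time"]):
        key = event[id_field]
        current = open_by_id.get(key)
        if current is None or event["_time"] - current[-1]["_time"] > max_pause:
            current = []
            transactions.append(current)
            open_by_id[key] = current
        current.append(event)
    return transactions
```

With events from the same client IP at 0, 10, and 500 seconds and a maximum pause of 60 seconds, the first two events form one transaction and the third starts a new one.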

57. Can we purchase Forwarder Licenses?

There is no need to purchase Forwarder licenses because they are included with Splunk.

58. Write the command used to check the running Splunk processes on Unix/Linux?

If you want to check the running Splunk Enterprise processes on Unix/Linux, then make use of the following command:

ps aux | grep splunk

59. List the data source types in Splunk?

  1. Files and directories
  2. Network events
  3. Windows sources
  4. Other sources

60. Name the command for restarting Splunk Daemon?

The Splunk daemon can be restarted with the command:

splunk restart splunkd

61. Name the command for restarting the Splunk web server?

We can restart the Splunk web server by using the command:

splunk restart splunkweb

62. How can we disable Splunk boot-start?

In order to disable Splunk boot-start, use the command:

$SPLUNK_HOME/bin/splunk disable boot-start

63. How can we disable the Splunk Launch Message?

You have to set the value OFFENSIVE=Less in splunk-launch.conf.

64. How can we set the default search time in Splunk 6?

In Splunk Enterprise 6.0, we need to use ‘ui-prefs.conf’. If we set the value there, all our users see it as the default setting. The file is placed in:

$SPLUNK_HOME/etc/system/local

65. Define MapReduce Algorithm?

MapReduce is defined as a programming model for processing huge data sets with a parallel, distributed algorithm on the cluster.

Apart from technical Interview questions, you have to prepare yourself for some general questions also. I have listed a few. Make sure you go through them as well.

66. Tell me about yourself?

67. Why should we hire you?

68. Tell me something about our company?

69. Tell me about your strengths and Weaknesses?

70. What makes you stand out from others?

Good luck with your Splunk Interview, and we hope our Splunk Interview Questions and Answers were of some help to you. You can also check our Qlikview Interview Questions.
